Tag Archives: Bridge to R for WPS

Clean Work Utility for WPS

Another new feature that will be included in the Bridge to R is the added functionality to clean temp workspaces. As most of you are aware, orphaned directories and files can take a large amount of space out of your temp work folder.

CleanWork is another Windows-only application (at this time) that can be run at any time. CleanWork will only delete old orphaned files and directories and not a currently used temp folder. A network administrator could easily set up CleanWork to run via a scheduler on a daily basis or even more often. This would ensure that your temp work space always has the most available space.

CleanWork is easy to use. For example:

%CleanWork;

will find your system’s temp work space and delete all the old folders and files.
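
The idea behind this kind of cleanup can be sketched in a few lines of Python. This is not the CleanWork source; it is a minimal illustration that assumes "old" means "not modified within a given number of hours" and that the caller knows the name of any work folder still in use. The function name clean_work and the parameters max_age_hours and keep are hypothetical.

```python
import shutil
import time
from pathlib import Path


def clean_work(work_root, max_age_hours=24.0, keep=(), now=None):
    """Delete entries under work_root that have not been modified recently.

    Entries whose names appear in `keep` (for example the work folder of a
    live session) are never removed. Returns the sorted names of the
    entries that were selected for removal.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_hours * 3600
    keep_names = set(keep)
    removed = []
    for entry in Path(work_root).iterdir():
        if entry.name in keep_names:
            continue  # explicitly protected, e.g. the current work folder
        if entry.stat().st_mtime >= cutoff:
            continue  # touched recently, so assume it may still be in use
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry, ignore_errors=True)
        else:
            entry.unlink()
        removed.append(entry.name)
    return sorted(removed)
```

A scheduled job could call clean_work on the temp work location once a day; the age threshold is the main knob and should be longer than your longest-running job.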

CleanWork will be available in the next release of the Bridge to R, which is expected later next month.

About the author: Phil Rack is President of MineQuest, LLC and has been a SAS language developer for more than 25 years. MineQuest provides WPS and SAS consulting and contract programming services and is a reseller of WPS in North America.

How Important are Geocoding Services?

I’ve been thinking about writing a geocoder for WPS for some time now and have elements of it done. One thing that strikes me as problematic, and I should have thought about this aspect earlier, is the issue of keeping address and map tables up to date. I’m not sure how often these tables would have to be updated, but the amount of data that would have to be downloaded on, say, a quarterly basis is tremendous.

So, I’m now reconsidering the direction I took and think a better way is to simply allow the user (i.e. a WPS license holder who purchased their WPS license from MineQuest) to move their address data to a remote service for geocoding. There are a number of free and pay services offering geocoding and I don’t like the idea of paying for such a thing if it is at all avoidable.

Looking at the pay services, Bulk Geocoder has pricing on their website and they charge $500 for 100,000 records. I personally think that’s a substantial amount of money to pay. The free services limit the amount of data (i.e. the number of addresses) they will geocode for free but some of them are quite lenient. For example, Microsoft will allow a maximum of 200,000 addresses to be geocoded at a time. That’s pretty decent and I suspect that would handle the bulk requirements of many WPS license holders who need to geocode addresses.
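
To make those numbers concrete, here is a small Python sketch. It assumes the paid price scales linearly (the post only quotes $500 per 100,000 records, so pro-rating is an assumption) and it treats the 200,000-address limit as a per-submission batch size. The function names are hypothetical.

```python
def split_into_batches(addresses, batch_size=200_000):
    """Split a list of addresses into consecutive batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [addresses[i:i + batch_size]
            for i in range(0, len(addresses), batch_size)]


def estimated_paid_cost(record_count, price=500.0, per_records=100_000):
    """Pro-rated cost estimate, assuming linear pricing (an assumption)."""
    return record_count * price / per_records


if __name__ == "__main__":
    addresses = [f"address {n}" for n in range(450_000)]
    batches = split_into_batches(addresses)
    print([len(b) for b in batches])          # sizes of each submission
    print(estimated_paid_cost(len(addresses)))  # dollars under linear pricing
```

Under these assumptions, 450,000 addresses would need three submissions to the free service, against roughly $2,250 at the quoted paid rate.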

Since this would be a bulk geocoding system, i.e. not processing a record interactively, a web service written for and using WPS to communicate with the geocoding organization would be very useful and quite economical. SAS likes to add a tax to companies who are Data Service Providers, and many of these so-called DSPs are in the advertising and marketing industries. If they can realize a savings over using the SAS System, then this is something that we need to strongly consider offering.

Further research will take some time and I still have questions about turnaround time. But if the turnaround time is low, the cost savings over using a paid outside geocoding service, or over SAS/GIS (think $$$) if you are a SAS user, would make this a very reasonable solution.

Bridge to R for Linux

I’ve been working on porting over the Bridge to R from the Windows platform over to Linux. I’m happy to say that I have most of the functionality of the Bridge working now. It’s been a trying few days chasing down little issues that have cropped up between the two platforms.

One of the problems I encountered is that I use mixed case when creating filenames. I like camel case variable names for readability and I tend to extend that to filenames as well. I’m not very consistent in my naming conventions when it comes to CamelCase and, of course, Linux requires that the case be exact. Chalk one up for the Windows platform, in my opinion.

The other, more hideous issue that I discovered is that a file name constructed under WPS on Windows with a trailing blank, such as:

“//myfolder/myprg.r ”

is different from

“//myfolder/myprg.r”

On the Windows platform, that blank space on the end is either ignored or compressed out by the operating system. That’s not the case under Linux.
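
Both problems can be guarded against before a path ever reaches the operating system. The Python sketch below is only an illustration of the idea, not code from the Bridge: it trims stray blanks from a constructed file name and looks a name up without regard to case. Both function names are hypothetical.

```python
def normalize_script_path(raw):
    """Remove leading and trailing whitespace from a constructed file name."""
    return raw.strip()


def find_ignoring_case(wanted, existing_names):
    """Return the entry in existing_names that matches wanted, ignoring case.

    Returns None when nothing matches and raises ValueError when more than
    one entry matches, since on Linux those are genuinely different files.
    """
    target = normalize_script_path(wanted).lower()
    matches = [name for name in existing_names if name.lower() == target]
    if len(matches) > 1:
        raise ValueError(f"ambiguous file name: {matches}")
    return matches[0] if matches else None


if __name__ == "__main__":
    print(repr(normalize_script_path("//myfolder/myprg.r ")))
    print(find_ignoring_case("MyPrg.R ", ["myprg.r", "other.r"]))
```

Trimming is the safer of the two fixes. Case-insensitive lookup hides the inconsistency rather than removing it, so settling on one naming convention is still the better long-term answer.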

So now the Bridge running on Linux can convert WPS data sets to R data frames, execute R jobs, read the R output into the appropriate WPS log or listing files, and convert R data frames back into WPS data sets. I still need to do some additional testing as well as create some documentation for it. I’m going to use the existing Bridge to R documentation and modify it to reflect the differences between the platforms and hence, there will be a Bridge to R for Linux document.

My current development of the Bridge has been under Fedora 32 x86. I’m assuming that Linux is Linux and this will work on other flavors of Linux. I’m also hoping that the same code base will operate on Mac OS and Solaris as well as AIX. That may be wishful thinking, but I’ve tried to make this as vanilla as possible for portability reasons.

This release will mark the first time that the Bridge will not be available for both SAS and WPS. I don’t have access to SAS on Linux for testing and porting and am pretty sure that I’m not going to be licensing a copy of SAS for just this purpose. So, the Bridge to R for Linux will be only available on WPS.

I expect to have a commercial release of the Bridge to R for Linux by the end of the month. And just as with the Windows versions, if you purchase your Linux WPS licenses from MineQuest, we will provide the Bridge to R for Linux free.

Creating an API for the Bridge to R

Today, we started developing an API for the Bridge to R that will allow SAS and WPS developers to create pseudo procedures written in R and callable by WPS or SAS using the Bridge. What we are hoping to achieve is to make creating callable routines with the macro language a lot easier by doing the drudge work for the programmer.

We’re still specifying all the APIs that will be needed and we’re almost certain that we will have to add more in the future, but this will make creating callable statistical routines using R and WPS or SAS easier, less error-prone and quicker. Some of the APIs will handle such things as getting the data set names that will be passed to R to be converted to data frames, counting the variables on parameter lists, and the initialization and cleanup of the scripting and execution environment.
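
To give a feel for the kind of drudge work involved, here is a Python sketch of one such helper: splitting a blank-delimited parameter value into names and counting them. It is an illustration only; the actual APIs are macro-language routines and are still being specified, and the name parse_name_list is hypothetical.

```python
def parse_name_list(value):
    """Split a blank-delimited parameter value such as 'a b c' into names.

    Duplicates are rejected without regard to case, because SAS language
    names are not case sensitive.
    """
    names = value.split()
    seen = set()
    for name in names:
        key = name.lower()
        if key in seen:
            raise ValueError(f"duplicate name on parameter list: {name}")
        seen.add(key)
    return names


if __name__ == "__main__":
    names = parse_name_list("sales  regions coords")
    print(names, len(names))
```

Having one routine return both the names and their count means every pseudo procedure validates its parameter lists the same way.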

The APIs will be part of the Bridge, so if you as a developer decide to create a statistical routine, you will not have to purchase any additional software. All your users or clients will need will be the Bridge to R.

Notes on the Next Release of the Bridge to R

Thought I’d write a little about what MineQuest has been working on for the next release of the Bridge to R. We just wrapped up the programming for the latest release and I’m pretty happy with what is right around the corner for our current users and new users as well.

First, we’ve added the ability to export WPS data sets directly into your R workspace as R data frames. We’ve always provided support for taking a single WPS data set into a data frame, but this release makes it easy to export multiple data sets into R. This actually required a lot of effort to do and is based on requests from numerous customers who are using the Bridge.

Originally, I envisioned that people would use the Bridge in a similar way that they would use a WPS or SAS procedure. They would create a data set that contained all the variables they needed for a specific statistical routine in R and use that for their analysis. But I was easily convinced that this was short-sighted because it didn’t allow for the analyst to move all the data sets needed for such things as matrix operations into the R work space.

The other thing that convinced me that this was necessary is that I recently became aware of a book called "A Practical Guide to Geostatistical Mapping" by Tomislav Hengl. Tomislav writes about mapping, and to create maps you need to have multiple data sets. You need one that contains the data to be displayed and one that contains the coordinates. I eventually want to provide some mapping data sets for the Bridge to R so that one can create maps using the Bridge, and for that the ability to read multiple WPS data sets is necessary.

Exporting WPS data sets to R is accomplished by listing the names of the WPS data sets on the data= clause of the %Rstart macro. Here’s an example:

%Rstart(DataFormat=xpt, data=a b c, rGraphicsViewer=No)

The data sets a, b, and c are automatically exported to R data frames for you without any other commands or programming.

The other improvement in the next release of the Bridge to R is that you can import multiple data frames from your R session to WPS. This is easily done and just requires the analyst to list the R data frames on the Import= clause of the %Rstop macro to bring all the frames back into WPS. For example:

%Rstop(import=dataframe1 dataframe2 dataframe3);

where dataframe1, dataframe2 and dataframe3 are the names of the R data frames that you want to import back into WPS. This will create three WPS data sets named dataframe1, dataframe2, and dataframe3, respectively.

We’ve also added more error checking to the Bridge. We now catch errors when using the XPORT transport format. One problem with using XPORT as a transport format is that it’s limited to eight-character variable names. We now examine all the WPS data sets before they are exported to make sure that the variable names are eight characters or fewer in length. If not, we throw an exception, report on it, and don’t try to process the R code because we already know it won’t execute.
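
The shape of that check is simple enough to sketch in Python. This is not the Bridge’s code; it assumes the variable names of each data set are already available as a mapping from data set name to a list of names, and the exception class and function names are hypothetical.

```python
class XportNameError(ValueError):
    """Raised when a variable name is too long for the XPORT format."""


def find_long_names(datasets, limit=8):
    """Return (data set, variable) pairs whose variable name exceeds limit."""
    return [(ds, var)
            for ds, variables in datasets.items()
            for var in variables
            if len(var) > limit]


def check_xport_names(datasets, limit=8):
    """Fail before any export if a variable name is longer than limit."""
    offenders = find_long_names(datasets, limit)
    if offenders:
        detail = ", ".join(f"{ds}.{var}" for ds, var in offenders)
        raise XportNameError(
            f"variable names longer than {limit} characters: {detail}")


if __name__ == "__main__":
    tables = {"a": ["id", "customer_name"], "b": ["x", "y"]}
    try:
        check_xport_names(tables)
    except XportNameError as err:
        print(err)
```

Checking every data set up front and reporting all offenders at once saves the analyst from fixing one name, rerunning, and hitting the next.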

By the way, the reason we support the transport format is due to customer requests from those in the biostats area. They wanted to make sure that they can pass a possible data processing audit and they felt much more comfortable with the XPORT format than passing data via a CSV format.

So what’s left? With the next release of the Bridge to R (by the end of April 2010) we are updating the documentation and adding more sample R programs that demonstrate how to use the Bridge. We are adding another half dozen R graphic sample programs and a few more statistical type programs as well.

I’m very confident that the Bridge to R when used with WPS can complement the WPS system by allowing the analysts to do just about any kind of graphics or statistical procedures all from within the WPS IDE. With the low cost of the Bridge (free if you license WPS from MineQuest) and the use of open source R, you can replace SAS/IML, SAS/Graph and many of the SAS statistical modules and be state-of-the-art on your analytics platform.

How to Perform a SAS Replacement

So you need to move away from SAS because of pricing pressures or licensing issues and you’re given responsibility for the project. How would you go about finding a SAS replacement and what is involved in the conversion/migration? Many sites are able to assess the advantages of WPS in replacing SAS in 30 days or less.

The idea behind doing the replacement the way I’m proposing is to conserve your company’s investment in the knowledge and skills of its SAS programmers and developers. Also, quite a few companies have a sizeable inventory of SAS programs that it would be foolish to toss aside if you don’t have to. Hence, the only change is the software vendor and a much lower cost! These are the steps I recommend anyone follow to migrate from SAS to WPS, which is a SAS replacement (i.e. a SAS language compatible system) that runs on Windows desktops and servers as well as z/OS and, soon, the Unix family of platforms.

1. Make an assessment of the programming skills and the financial investment that your organization has in SAS programmers. If it’s at all sizeable, strongly consider keeping the language and lowering your cost by using other software technologies that use the same language and PROCS.

2. Segregate your users by their roles and how they use the SAS software. Make two camps… (1) Statistical Modelers and (2) Reporting Analysts and data managers.

3. Take a look at the platforms that you are currently running SAS on and how SAS is being used. Is it for ETL and Reporting or are you doing statistical model development? My own observations are that 98% of the sites I work with don’t do much in the way of statistical model development but use SAS for ETL, summarization of data, reporting and data preparation for other software packages. Make sure you have your current annual SAS license fees available for comparison purposes.

4. Take an inventory of all the organization’s SAS programs and make a copy, putting the copies in a separate folder.

5. Acquire a free WPS trial license from MineQuest by visiting the website at http://www.minequest.com/Pricing.html or by calling (614) 457-3714.

6. Run the SAS programs that you copied in step #4 through the WPS Script Compatibility Analyzer and view the output to get an idea of how compatible your SAS programs are with WPS. For the reporting analysts and data managers identified in step #2 above, expect to see near 100% compatibility.

7. Take a look at any statistical PROCS that your company is using and determine how feasible it is to run these models using R. You can use WPS and R by using The Bridge to R for WPS. The Bridge is available at no cost when you license your WPS software from MineQuest.
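
The inventory-and-copy part of step 4 is easy to script. The Python sketch below is one possible approach, assuming your programs use the .sas extension and live under a single root folder; the function name and both folder arguments are hypothetical.

```python
import shutil
from pathlib import Path


def copy_sas_programs(source_root, inventory_dir):
    """Copy every .sas file under source_root into inventory_dir.

    The folder layout is preserved so that same-named programs in
    different folders do not overwrite each other. Returns the relative
    paths of the copied programs.
    """
    source_root = Path(source_root)
    inventory_dir = Path(inventory_dir)
    copied = []
    # sorted() collects the full file list before any copying starts
    for program in sorted(source_root.rglob("*")):
        if not program.is_file() or program.suffix.lower() != ".sas":
            continue
        relative = program.relative_to(source_root)
        target = inventory_dir / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(program, target)  # copy2 also keeps file timestamps
        copied.append(relative.as_posix())
    return copied
```

The returned list doubles as the inventory itself: its length tells you how many programs you have to assess before running the compatibility analysis.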

After you have performed these steps, you are in a good position to determine the economic feasibility of replacing SAS with WPS. WPS is fast, efficient and a no-brainer for those who already program in the language. Think outside the box when trying to cut SAS costs. If you are just using SAS on a server for ETL, reporting and data summarization and preparation, you can easily save money by converting to WPS. If you are a Data Service Provider, we often save clients 80% on their software costs. The same goes for sites that are using SAS/IntrNet. WPS-Web can be a good replacement for SAS/IntrNet, and you can expose your organization’s data to the outside world and not pay exorbitant license fees for doing so.