Thought I’d write a little about what MineQuest has been working on for the next release of the Bridge to R. We just wrapped up the programming for the latest release and I’m pretty happy with what is right around the corner for our current users and new users as well.
First, we’ve added the ability to export multiple WPS data sets directly into your R workspace as R data frames. We’ve always supported taking a single WPS data set into a data frame, but this release makes it easy to export several data sets at once. This required a lot of effort and is based on requests from numerous customers who are using the Bridge.
Originally, I envisioned that people would use the Bridge much the way they would use a WPS or SAS procedure. They would create a data set that contained all the variables they needed for a specific statistical routine in R and use that for their analysis. But I was easily convinced that this was short-sighted because it didn’t allow the analyst to move all the data sets needed for such things as matrix operations into the R workspace.
The other thing that convinced me this was necessary is that I recently became aware of a book called "A Practical Guide to Geostatistical Mapping" by Tomislav Hengl. Tomislav writes about mapping, and to create maps you need multiple data sets: one that contains the data to be displayed and one that contains the coordinates. I eventually want to provide some mapping data sets for the Bridge to R so that users can create maps with it, which makes the ability to read multiple WPS data sets a requirement.
Exporting WPS data sets to R is accomplished by listing the names of the WPS data sets on the data= parameter of the %Rstart macro. Here’s an example:
%Rstart(DataFormat=xpt, data=a b c, rGraphicsViewer=No)
The data sets a, b, and c are automatically exported to R data frames for you without any other commands or programming.
The other improvement in the next release of the Bridge to R is that you can import multiple data frames from your R session into WPS. The analyst simply lists the R data frames on the import= parameter of the %Rstop macro to bring all of them back into WPS. For example:
%Rstop(import=dataframe1 dataframe2 dataframe3);
where dataframe1, dataframe2 and dataframe3 are the names of the R data frames that you want to import back into WPS. This will create three WPS data sets named dataframe1, dataframe2, and dataframe3, respectively.
We’ve also added more error checking to the Bridge, including checks for the XPORT transport format. One limitation of XPORT is that variable names can be at most eight characters long. We now examine all the WPS data sets before they are exported to make sure that every variable name is eight characters or fewer. If any name is too long, we throw an exception, report it, and skip the R code entirely because we already know it won’t execute.
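To make the idea of the pre-export check concrete, here is a minimal sketch in Python. It is not the Bridge's actual code, which lives in the WPS macros. The function names `find_long_names` and `check_before_export`, and the sample data sets and variable names, are hypothetical and exist only for illustration. The sketch assumes each data set is described simply by a list of its variable names.

```python
# Illustrative sketch only: the Bridge's real check is not shown in this post.
# All names below are hypothetical.

MAX_XPORT_NAME_LEN = 8  # XPORT limits variable names to eight characters


def find_long_names(datasets):
    """Map each offending data set to its variable names longer than the limit."""
    problems = {}
    for ds_name, variables in datasets.items():
        too_long = [v for v in variables if len(v) > MAX_XPORT_NAME_LEN]
        if too_long:
            problems[ds_name] = too_long
    return problems


def check_before_export(datasets):
    """Raise before any export happens if a variable name is too long."""
    problems = find_long_names(datasets)
    if problems:
        details = "; ".join(
            f"{ds}: {', '.join(names)}" for ds, names in sorted(problems.items())
        )
        raise ValueError(
            f"XPORT variable names must be {MAX_XPORT_NAME_LEN} characters "
            f"or fewer ({details})"
        )


# Hypothetical data sets a, b, c, described by their variable names.
datasets = {
    "a": ["id", "height"],
    "b": ["id", "systolic_bp"],  # 11 characters, too long for XPORT
    "c": ["region"],
}

try:
    check_before_export(datasets)
except ValueError as err:
    print(f"Export stopped: {err}")
```

The point of checking every data set up front is that a single bad name stops the run before any R code is submitted, rather than failing partway through.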
By the way, the reason we support the transport format is customer requests from those in the biostats area. They wanted to make sure that they could pass a possible data processing audit, and they felt much more comfortable with the XPORT format than with passing data via CSV.
So what’s left? For the next release of the Bridge to R (by the end of April 2010) we are updating the documentation and adding more sample R programs that demonstrate how to use the Bridge. We are adding another half dozen R graphics sample programs and a few more statistical programs as well.
I’m very confident that the Bridge to R complements WPS by letting analysts do just about any kind of graphics or statistical procedure, all from within the WPS IDE. With the low cost of the Bridge (free if you license WPS from MineQuest) and the use of open source R, you can replace SAS/IML, SAS/GRAPH, and many of the SAS statistical modules and be state-of-the-art on your analytics platform.
About the author: Phil Rack is President of MineQuest, LLC, and has been a SAS language developer for more than 25 years. MineQuest provides WPS and SAS consulting and contract programming services and is a reseller of WPS in North America.