Extending the Bridge to R – Statistical Processing

We put the Bridge to R out on the internet for a free 60-day trial to (1) gauge users' interest in using R and (2) gain some exposure for the software. So far, there have been more downloads of the WPS version than of the SAS version of the Bridge. Although it has only been available for four days, I received an interesting email on Sunday morning. It asked whether it would be possible to create a standardized set of macros, along with a standard calling and implementation convention, as part of the Bridge to R, so that developers could create statistical macros using R as the calculation engine.

I have to admit, this has me really intrigued. So what would be involved in creating a standardized suite of macros that other developers can use to create user-defined routines that call R from either WPS or SAS? Personally, I can’t see the value in replicating anything that already exists in the SAS/Base or WPS-Core library. I can, however, see value in wrapping some of the most popular statistical procedures in macros that take a predefined set of parameters. As an example, let’s look at what would be required to create a forecast using the R library forecast created by Rob Hyndman.

If we are trying to forecast the variable pop and have another variable called startyr, all we really need to do is pass R the start date of the series, the frequency of the series, and the variable we want to forecast (pop). If start = 1970 and the frequency of the series is 1, then the R code would look like:

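library(forecast)   # provides ets(), forecast() and accuracy()

# Year index as an annual time series starting in 1970
yr <- ts(year, start = 1970, frequency = 1)

# Fit an exponential smoothing model and print its accuracy measures
est <- ets(pop)
accuracy(est)

# Fitted values, residuals, and forecasts from the model
fit <- fitted(est)
res <- residuals(est)
pred <- forecast(est)

# Convert to data frames so they can be imported back into WPS or SAS
fit <- as.data.frame(fit)
res <- as.data.frame(res)
pop <- as.data.frame(pop)
yr <- as.data.frame(yr)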

We can easily generate this code to run within the Bridge to R and use the macro language to populate the parameters. A simple template that would run the R code would look like:

%let startyr = 1970; *--> do some preprocessing to get rid of this;

%Rstart(dataformat=csv,data=mydata,rGraphicsViewer=False);
datalines4;

library(forecast)
attach(&data)

yr = ts(year, start=&startyr, freq=&freq)

est <- ets(&var)
accuracy(est)
fit <- fitted(est)
res <- residuals(est)
pred <- forecast(est)
fit <- as.data.frame(fit)
res <- as.data.frame(res)
&var <- as.data.frame(&var)
&date <- as.data.frame(&date)

;;;;
%Rstop(import=&var fit res &date pred);

The Bridge will take care of validating the existence of the data sets as well as reading in the output (log and list files) from R, including importing the R data frames back into WPS or SAS. What would have to be added are routines to parse variable names out of a list (easily done), check that they exist in the data set, check that the variables are of the correct type (character or numeric) to be passed to R, and handle missing values.
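The parsing and checking routines described above could be sketched in the macro language roughly as follows. This is only an illustration: the macro name %check_numeric_vars, its parameters, and the global flag vars_ok are hypothetical and not part of the Bridge to R, and it assumes the standard OPEN, VARNUM, VARTYPE, and COUNTW functions are available through %sysfunc in your WPS or SAS release. It does not deal with missing values.

```sas
/* Hypothetical helper, not part of the Bridge to R.                   */
/* Checks that every name in a blank-separated list is a numeric       */
/* variable in the data set. Sets the global flag vars_ok to 1 or 0.   */
%macro check_numeric_vars(data=, vars=);
    %local dsid i name varnum rc ok;
    %global vars_ok;
    %let vars_ok = 0;
    %let ok = 1;

    /* Nothing to check if the list is empty. */
    %if %length(&vars) = 0 %then %do;
        %put ERROR: No variables were supplied.;
        %return;
    %end;

    /* OPEN returns 0 when the data set cannot be opened. */
    %let dsid = %sysfunc(open(&data));
    %if &dsid = 0 %then %do;
        %put ERROR: Data set &data could not be opened.;
        %return;
    %end;

    /* Walk the list one name at a time. */
    %do i = 1 %to %sysfunc(countw(&vars, %str( )));
        %let name = %scan(&vars, &i, %str( ));
        %let varnum = %sysfunc(varnum(&dsid, &name));
        %if &varnum = 0 %then %do;
            %put ERROR: Variable &name was not found in &data..;
            %let ok = 0;
        %end;
        %else %if %sysfunc(vartype(&dsid, &varnum)) ne N %then %do;
            %put ERROR: Variable &name in &data is not numeric.;
            %let ok = 0;
        %end;
    %end;

    /* Always close the data set before returning the result. */
    %let rc = %sysfunc(close(&dsid));
    %let vars_ok = &ok;
%mend check_numeric_vars;
```

A forecasting macro might call %check_numeric_vars(data=mydata, vars=pop Yr) first and only hand the data to R when &vars_ok is 1.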

Thus, a very simple macro that a developer might implement for the automated forecasting of a univariate time series might look like:

%AutoForecast(dataset=mydata,
              date=Yr,
              Freq=1,
              var=pop,
              output= dataset that contains all the forecasted values);

Of course, the above example is very elementary. The developer would probably want to add some bells and whistles, such as being able to suppress the printed output, create plots and capture them in a catalog, process multiple variables, etc.

The value of creating a standardized set of macros and routines for statistical developers includes:

1. Create custom statistical routines in WPS or SAS that are not possible with WPS or SAS alone.

2. Inexpensively distribute these custom routines without requiring users to have specific statistical libraries.

3. Save money by not having to license the SAS/Toolkit.

4. Reduce costs by replacing statistical libraries from which your organization uses just one or two procedures.

5. Use it as a basis for developing cost-effective vertical market applications, because your customers will not have to license additional modules/libraries from SAS.

About the author: Phil Rack is President of MineQuest, LLC. and has been a SAS language developer for more than 25 years. MineQuest provides WPS and SAS consulting and contract programming services and is a reseller of WPS in North America.