Tag Archives: R

What’s New in WPS v3.1

We have a new release of WPS out the door and we wanted to share the news! This is a major release and includes a number of new features and procedures.

New and Enhanced Communication Features

WPS Link – WPS Link is an interface for the communication between a desktop version of WPS (or a fat client) and a WPS Server. WPS Link implements the Eclipse Workbench that allows a user to submit WPS programs from their desktop to a server. WPS Link also includes a file explorer where a user can store their programs on the server and access them as if they were on their desktop. WPS Link will only talk to a WPS Server and does not provide desktop-to-desktop communications. WPS Link is included as part of your WPS license fee.

WPS Communicate – is a product that allows for the remote submitting and scripting of WPS programs to a server. It differs from WPS Link in that WPS Communicate allows for the Upload and Download of files and data sets programmatically. Communicate will only communicate between a desktop copy of WPS and a Server or Server-to-Server. WPS Communicate does not provide desktop-to-desktop communications. The WPS Communicate client is included in your desktop license and a server connection is included with a server license of WPS.

 

New Procedures

PROC ARIMA – ARIMA (autoregressive integrated moving average) is a time series modeling technique that helps you better understand your data or predict future points in a data series.

PROC EXPAND – is a procedure that allows the WPS user to expand or contract time series data and interpolate missing values as well.

PROC FORECAST – a forecasting module that implements basic, highly automated forecasting methods. PROC FORECAST can forecast hundreds of series at a time, using either separate variables or the BY statement.

PROC HTTP – allows access to remote “cloud-based” files.

PROC JAVAINFO – allows the WPS developer to ascertain information about the Java environment that WPS is using.

PROC KDE – The KDE procedure performs either univariate or bivariate kernel density estimation.

PROC R – Proc R is the first procedure written by World Programming that is unique to WPS. Proc R allows you to execute R code from within the Eclipse environment and to exchange data frames and WPS data sets between the two applications.

PROC SOAP – The SOAP procedure reads XML from a file identified by a fileref and writes XML output to another file that also has a fileref.

PROC VARCLUS – Varclus is a procedure that implements variable reduction by separating variables into non-overlapping groups (i.e. clusters).

PROC X12 – is a procedure that seasonally adjusts monthly or quarterly time series data.

 

New System Features

DBCS – Double Byte Character Set support is available for the first time in this release. DBCS provides support for languages that have more than 256 characters. Not available on z/OS.

JavaObj – is an interface that allows the WPS and Java developer to run Java Programs from WPS.

Secure Email – WPS now has support for secure email.

 

Database Engine Features

SAP Hana – Support for SAP’s in-memory database is now included as a new access engine.

Actian Matrix – A new database engine for Actian Matrix (formerly Paraccel) has been implemented and is included as a component of the system.

Netezza – The WPS engine for Netezza now supports named pipe bulk loading and unloading.

MySQL – Added bulk insert functionality for MySQL.

SSL – added support for SSL (Secure Sockets Layer).

 

Pricing and licensing

Pricing for the new release remains the same, starting at $1,311 for a new workstation license. Don’t expect any price increases until the end of the year. With regard to licensing, there are no up-charges for Data Service Providers.

 

Evaluations and Quotes

MineQuest Business Analytics is an authorized reseller of the World Programming System. Contact us for a quote or to arrange a free 30-day evaluation of any of our products on a desktop or server.

About the author: Phil Rack is President of MineQuest Business Analytics, LLC located in Grand Rapids, Michigan. Phil has been a SAS language developer for more than 25 years. MineQuest provides WPS and SAS consulting and contract programming services and is an authorized reseller of WPS in North America.

High Performance Workstations for BI

There’s one thing I really enjoy, and that’s powerful workstations for performing analytics. It’s fun, and often insightful, to speculate on the design and then build a custom higher-end workstation for running BI applications like WPS and R.

Ars Builds

Every quarter, Ars Technica goes through an exercise where they build three PCs, mainly to assess gaming performance, and then do a price vs. performance comparison. After reading a few of these quarterly builds you will soon notice a trend: the graphics card plays a major role in their performance assessment. The CPU, number of cores and fixed storage tend to play only a minimal role when comparing the machines.

This of course is in contrast to what we want to do for our performance benchmarks. We are taking a holistic approach that looks at CPU throughput, disk I/O and graphics to get the most for the dollar on a workstation build. But Ars does have a lot to recommend when it comes to benchmarking, and I think it’s worthwhile including some of their ideas.

What Constitutes a High End Analytics Workstation?

This is an interesting question and one that I will throw out for debate. It’s so easy to get caught up in spending thousands of dollars, if not ten thousand dollars (see the next section), on a workstation. One thing that even the casual observer will soon notice is that being on the bleeding edge is a very expensive proposition. It’s an old adage that you are only as good as your tools. There’s also the adage that it’s a poor craftsman who blames his tools. In the BI world, especially when speed means success, it’s important to have good tools.

As a basis for what constitutes a high end workstation, I will offer the following as a point of entry.

  • At least 4 logical CPUs.
  • At least 8GB of RAM, preferably 16GB to 32GB.
  • Multiple hard drives for OS, temporary workspace and permanent data set storage.
  • A graphics card that can be used for more than displaying graphics, i.e. parallel computing.
  • A large display – 24” capable of at least 1920×1080.

As a mid-tier solution, I would think that a workstation composed of the following components would be ideal.

  • At least 8 logical CPUs.
  • A minimum of 16GB of RAM.
  • Multiple hard drives for OS, temporary workspace and permanent data set storage with emphasis on RAID storage solutions and SSD Caching.
  • A graphics card that can be used for more than displaying graphics, i.e. parallel computing.
  • A large display – 24” capable of at least 1920×1080.

As a high end solution, I would think that a workstation built with the following hardware would be close to ultimate for many (if not most) analysts.

  • Eight to 16 logical CPUs – Xeon class (or a possible step down to an Intel i7).
  • A minimum of 32GB of RAM and up to 64GB.
  • Multiple hard drives for OS, temporary workspace and permanent data set storage with emphasis on RAID storage solutions and SSD Caching.
  • A graphics card that can be used for more than displaying graphics, i.e. parallel computing.
  • Multiple 24” displays capable of at least 1920×1200 each.

I do have a bias towards hardware that is upgradeable. All-in-one solutions tend to be one-shot deals and thus expensive. I like upgradability for graphics cards, memory, hard drives and even CPUs. Expandability can save you thousands of dollars over a period of a few years.

The New Mac Pro – a Game Changer?

The new Mac Pro is pretty radical from a number of perspectives. It’s obviously built for video editing, and its small size is striking. As a business analytics computer it offers some intriguing prospects. You have multiple cores, lots of RAM and high end graphics, but limited internal storage. That’s my main criticism of the new Mac Pro. The base machine comes with 256GB of storage, and that’s not much for handling large data sets. You are forced to go to external storage solutions to be able to process large data sets. Although I’ve not priced out the cost of adding external storage, I’m sure it’s not inexpensive.

Benchmarks

This is a tough one for me because organizations have such a wide array of hardware, and some benchmarks are going to require hardware that has specific capabilities, for example graphics cards that are CUDA enabled for parallel processing in R. There is also the fact that we use the Bridge to R for invoking R code, and the Bridge to R only runs on WPS (and not SAS).

I did write a benchmark a while ago that I like a lot. It provides information on the hardware platform (i.e. the amount of memory and the number of logical CPUs available) and just runs the basic suite of PROCs that I know are available in both WPS and SAS. Moving to more statistically oriented PROCs such as LOGISTIC and GLM may be difficult because SAS license holders may not have the statistical libraries necessary to run the tests. That’s a major drawback to licensing the SAS System. You are nickel and dimed to death all the time. The alternative is to have a workstation benchmark that is specific to WPS.

Perhaps the benchmark can be written so that it tests whether certain PROCs and libraries are available and also determines whether the required hardware (such as CUDA processors) is present before running each specific benchmark. Really, the idea is to determine the performance of specific software on a specific set of hardware, not to compare R, WPS and SAS.
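
One way to picture that idea is a small harness in which every benchmark step carries its own availability check, and steps that cannot run are reported as skipped rather than stopping the whole job. The sketch below is in R purely to illustrate the pattern. The function run_benchmarks, the step names and the package name someGpuPackage are all hypothetical, and the WPS side would need its own equivalent checks for PROCs and libraries.

```r
# Sketch only: run_benchmarks() and the step definitions are hypothetical.
# Each step is a list with two functions:
#   available() returns TRUE when the step's requirements are met
#   run()       performs the workload to be timed
run_benchmarks <- function(steps) {
  results <- lapply(names(steps), function(name) {
    step <- steps[[name]]
    if (!isTRUE(step$available())) {
      # Requirement missing: record the step as skipped and move on.
      return(data.frame(step = name, status = "skipped",
                        seconds = NA_real_, stringsAsFactors = FALSE))
    }
    elapsed <- system.time(step$run())[["elapsed"]]
    data.frame(step = name, status = "ran",
               seconds = elapsed, stringsAsFactors = FALSE)
  })
  do.call(rbind, results)
}

steps <- list(
  base_sort = list(
    available = function() TRUE,
    run       = function() sort(runif(1e5))
  ),
  gpu_step = list(
    # requireNamespace() returns FALSE when a package is not installed.
    # "someGpuPackage" is a made-up name standing in for a CUDA-backed package.
    available = function() requireNamespace("someGpuPackage", quietly = TRUE),
    run       = function() stop("placeholder for a GPU workload")
  )
)

print(run_benchmarks(steps))
```

The report then shows one row per step, so a machine without the optional hardware still produces a usable result for everything else.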

Price and Performance Metrics

One aspect of Ars that I really like is that when they do their benchmarks, they calculate a cost comparison for each build. They often base this on hardware pricing at the time of the benchmark. What they don’t do is price in the cost of the software for such things as video editing. I think it’s important to show the cost of both hardware and software as part of a price/performance benchmark metric.
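
As a rough illustration of folding software into the metric, the R sketch below adds hardware and software cost together and divides by benchmark throughput. The function name price_performance, the example hardware price and the example run time are made up for illustration. Only the $1,311 workstation license figure comes from the pricing quoted elsewhere on this page.

```r
# Sketch only: price_performance() is a hypothetical helper.
# Inputs may be single values or vectors with one element per build.
price_performance <- function(hardware_usd, software_usd, runtime_sec) {
  total_usd     <- hardware_usd + software_usd
  runs_per_hour <- 3600 / runtime_sec      # throughput of the benchmark
  data.frame(
    total_usd            = total_usd,
    runs_per_hour        = runs_per_hour,
    usd_per_run_per_hour = total_usd / runs_per_hour  # lower is better
  )
}

# Made-up hardware price and run time, for illustration only.
print(price_performance(hardware_usd = 2000, software_usd = 1311,
                        runtime_sec = 600))
```

Dividing total cost by throughput gives a single number that can be compared across builds, and leaving the software cost as its own argument makes it easy to see how much of the result is driven by licensing rather than hardware.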

Moving Forward

I’m going to take some time and modify the WPS Workstation Benchmark Program that I wrote so that it doesn’t spew out so much unnecessary output into the listing window. I would like it to just show the output from the benchmark report. I think it would also be prudent to see if some R code could be included in the benchmark and compare and contrast the performance if there are some CUDA cores available for assisting in the computations.

Plotting Points on a Street Level Map using the Bridge to R and WPS

In the last few installments of this blog, I have shown how you can use WPS and the Bridge to R to calculate drive distances, geocode records and pull down a map from Google Maps. I want to use this post to pull all this together and show how you can geocode your addresses and plot them on a street level map.

First some background you need to know about using Google for geocoding and mapping. There are limits to what Google will allow you to do with their services before they want you to start paying. You can geocode 2,500 records a day for free. You can pull down 25,000 maps a day for free. Once you start moving past these limits, there are fees involved.
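
To stay inside a daily quota, a larger file can be split into batches that are geocoded on different days. Here is a minimal R sketch, assuming the 2,500-per-day figure above still applies. The helper name daily_batches and the generated sample addresses are hypothetical.

```r
# Sketch only: daily_batches() is a hypothetical helper.
daily_batches <- function(addresses, per_day = 2500) {
  addresses <- as.character(addresses)
  # Batch index: 1 for the first per_day addresses, 2 for the next, and so on.
  batch_id <- ceiling(seq_along(addresses) / per_day)
  unname(split(addresses, batch_id))
}

# 6,000 placeholder addresses, just to show the split.
batches <- daily_batches(sprintf("address %d", 1:6000))
length(batches)          # number of days needed
sapply(batches, length)  # number of addresses to geocode on each day
```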

One thing that you should probably consider is caching geocoded records locally so that you don’t have to go back to the Google geocoder every time you want to plot some points on a map. I could easily run through 2,500 addresses in a day. The limit on the number of maps is just not an issue for me. I think 25,000 maps a day is a very liberal offering for the kind of work that I would want to use the service for.
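
A local cache can be as simple as a CSV file keyed on the address string. The R sketch below is one possible shape for it, not something taken from the Bridge to R. The function name geocode_with_cache, the cache file layout and the geocoder argument are all assumptions. The geocoder argument is any function that takes a character vector of addresses and returns a data frame with lon and lat columns, one row per address.

```r
# Sketch only: geocode_with_cache() and its arguments are hypothetical names.
geocode_with_cache <- function(addresses, cache_file, geocoder) {
  addresses <- as.character(addresses)

  # Load the cache if it exists; otherwise start with an empty one.
  if (file.exists(cache_file)) {
    cache <- read.csv(cache_file, stringsAsFactors = FALSE)
  } else {
    cache <- data.frame(query = character(0), lon = numeric(0),
                        lat = numeric(0), stringsAsFactors = FALSE)
  }

  # Only addresses that are not in the cache go to the geocoder,
  # so only those count against the daily quota.
  todo <- setdiff(unique(addresses), cache$query)
  if (length(todo) > 0) {
    fresh <- geocoder(todo)
    cache <- rbind(cache,
                   data.frame(query = todo, lon = fresh$lon, lat = fresh$lat,
                              stringsAsFactors = FALSE))
    write.csv(cache, cache_file, row.names = FALSE)
  }

  # Return one row per input address, in the original order.
  result <- cache[match(addresses, cache$query), ]
  rownames(result) <- NULL
  result
}
```

In the context of this post, the geocoder argument could be a thin wrapper around ggmap's geocode function. Duplicate addresses, such as the stations that share a street number in the sample data, are then looked up only once.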

In the sample code below, I split the mapping process into two components to make the entire process easier to understand. I first geocode the file to get the latitude and longitude for each record. The second part of the process creates a map and uses the latitudes and longitudes to plot points on it. We could have put this into a single step, but it wouldn’t be as clear or as flexible.

Without further ado, here’s the code using the Bridge to R and WPS.

data gasstations;
input company $1-29 address $30-52 city $53-64 state $66-67;
addr2geocode=trim(address)||', '||trim(city)||', '||trim(state);
cards;
Citgo Gas Station            5189 28th St Se        Grand Rapids MI
28th Street BP               5155 28th St Se        Grand Rapids MI
Twenty-Eighth Street C Store 5556 28th St Se        Grand Rapids MI
Speedway                     4045 28th St Se        Grand Rapids MI
Speedway                     2305 E Paris Ave Se    Grand Rapids MI
Superamerica                 2305 E Paris Ave Se    Grand Rapids MI
Shell Food Mart              3960 28th St Se        Grand Rapids MI
Admiral Petroleum            3927 28th St Se        Grand Rapids MI
Cascade C Store              4591 Cascade Rd Se     Grand Rapids MI
Friendly Food Shops          6799 Cascade Rd Se     Grand Rapids MI
Family Fare Quick Stop       6799 Cascade Rd Se     Grand Rapids MI
Cascade Citgo                6820 Cascade Rd Se     Grand Rapids MI
Dutton Fuel Mart LLC         2560 E Beltline Ave Se Grand Rapids MI
Centerpointe Marathon        2560 E Beltline Ave Se Grand Rapids MI
Shell Food Mart              2600 E Beltline Ave Se Grand Rapids MI
Speedway                     4018 Cascade Rd Se     Grand Rapids MI
Grand Rapids Gas Incorporated3214 28th St Se        Grand Rapids MI
Cascade Shell                4033 Cascade Rd Se     Grand Rapids MI
Speedway                     4665 44th St Se        Kentwood     MI
Super Petroleum Incorporated 2411 28th St Se        Grand Rapids MI
;;;;
run;


*--> Geocode the addresses using the Google Geocoder. Keep the geocoded records
     in the output data set named locs for further processing.;

%rstart(dataformat=csv,data=gasstations,rGraphicsFile=);
datalines4;

## options(repos=structure(c(CRAN="http://cran.case.edu/")))
## install.packages("ggmap", dependencies = FALSE)

   attach(gasstations)

   library(ggmap)

   gaddress <- as.character(gasstations$addr2geocode)
   locs <- geocode(gaddress,output="more")

;;;;
%rstop(import=locs);


*--> Pull a Google map that is centered on a particular address and plot the locations
     on the map. Use the data set that was created (locs) above that contains the 
     lat and longs to plot the points.;
     
Title 'Gas Stations on or Near 28th Street';
Title2 'Grand Rapids, Michigan';     
     
%rstart(dataformat=csv,data=locs,rGraphicsFile=);
datalines4;

attach(locs)
addr <- locs;

library(ggmap)

map.center <- geocode('3960 28th St Se, Grand Rapids, MI');

grmap <- qmap(c(lon = map.center$lon, lat = map.center$lat), zoom = 13, color = 'color', legend = 'topleft')

# size is a fixed value, so it is set outside aes() rather than mapped to the data
grmap + geom_point(aes(x = lon, y = lat), data = addr, size = 3)

;;;;
%rstop(import=);

The map that is created looks like this:

[Image: GR_Stations, street level map of gas stations on or near 28th Street, Grand Rapids, Michigan]

I cropped this down a bit and got rid of the borders so that it would be easier to view on this blog. Note the black points on the map that indicate the locations of the gas stations. We could continue by labeling the points with the names of the service stations, but that is a good exercise for the reader who wants to learn more about using ggmap and street level mapping.

If you want to learn more about ggmap and street level mapping, I encourage you to take a look at the article “ggmap: Spatial Visualization with ggplot2” in The R Journal, which is available in PDF format. What I have presented is really a quick and dirty set of examples that just begin to scratch the surface of what ggmap can do for you.

Creating a Street Level Map with WPS and the Bridge to R

Creating a street level map using the Bridge to R and WPS is actually pretty easy. As in our other examples (see the two previous blog posts), we again use ggmap to pull down a map from Google Maps and display it using HTML. Amazingly, this takes only four lines of R code. Here’s an example:

 

%rstart(dataformat=man,data=,rGraphicsFile=);
datalines4;

   library(ggmap)

   bp <- "4045 28th St Se, Grand Rapids, MI, USA"
   grmap <- qmap(bp, zoom=12)
   print(grmap)

;;;;
%rstop(import=);

The code is fairly easy to follow. We load the ggmap library that will do most of the work for us. We center the map using the address “4045 28th St Se, Grand Rapids, MI, USA”. The next line queries the map with a specified zoom level (we are using zoom level 12). Finally, we print the map using the print function.

This is what the map looks like.

[Image: street level map centered on 4045 28th St Se, Grand Rapids, MI, at zoom level 12]

 

We can actually take this a bit further. Instead of using a known address, we can use a place of interest for querying and creating the map. If we replace the address in the code above with “White House, Washington DC, USA” we get a map like the one below.

[Image: street level map centered on the White House, Washington DC]

So now we have seen how easy it is to pull down a map from Google using ggmap and the Bridge to R for WPS. If you have a copy of the Bridge to R, I recommend you play with the demonstration programs to get an idea of what you can do with the software and the mapping service. It’s always fun to see what gets rendered using ggmap, R and the Bridge to R.

Geocoding with WPS and the Bridge to R

 

One of the things that I truly enjoy is the flexibility of the SAS language and how well WPS allows you to integrate the product with other services. One aspect of research, whether you are in the social sciences, marketing or some other area of study, is the use and application of location data.

I’m not talking necessarily about getting the data from your cell phone on where you are or have been, but in taking address data and using it to create a business advantage. Visualizing data on a map is important for many people but it’s often a laborious task to get all the data enhanced so that it can be mapped or plotted. Specifically, I’m talking about taking an address and finding additional information such as political districts and latitude and longitude for each address.

Using the Bridge to R and WPS, we can use R to geocode our data. In our example, we will use ggmap, an R package developed and written by Hadley Wickham and David Kahle. It is a truly amazing package, and every time I use it, I learn something new.

In this example, I have 20 records that hold the names and addresses of gas stations in Cascade Township, which is a part of Grand Rapids, Michigan. What we want to do is geocode these 20 records to find their latitude and longitude. Below is the entire code snippet to do just that.

data gasstations;
input company $1-29 address $30-52 city $53-64 state $66-67;
addr2geocode=trim(address)||', '||trim(city)||', '||trim(state);
cards;
Citgo Gas Station            5189 28th St Se        Grand Rapids MI
28th Street BP               5155 28th St Se        Grand Rapids MI
Twenty-Eighth Street C Store 5556 28th St Se        Grand Rapids MI
Speedway                     4045 28th St Se        Grand Rapids MI
Speedway                     2305 E Paris Ave Se    Grand Rapids MI
Superamerica                 2305 E Paris Ave Se    Grand Rapids MI
Shell Food Mart              3960 28th St Se        Grand Rapids MI
Admiral Petroleum            3927 28th St Se        Grand Rapids MI
Cascade C Store              4591 Cascade Rd Se     Grand Rapids MI
Friendly Food Shops          6799 Cascade Rd Se     Grand Rapids MI
Family Fare Quick Stop       6799 Cascade Rd Se     Grand Rapids MI
Cascade Citgo                6820 Cascade Rd Se     Grand Rapids MI
Dutton Fuel Mart LLC         2560 E Beltline Ave Se Grand Rapids MI
Centerpointe Marathon        2560 E Beltline Ave Se Grand Rapids MI
Shell Food Mart              2600 E Beltline Ave Se Grand Rapids MI
Speedway                     4018 Cascade Rd Se     Grand Rapids MI
Grand Rapids Gas Incorporated3214 28th St Se        Grand Rapids MI
Cascade Shell                4033 Cascade Rd Se     Grand Rapids MI
Speedway                     4665 44th St Se        Kentwood     MI
Super Petroleum Incorporated 2411 28th St Se        Grand Rapids MI
;;;;
run;

proc print data=gasstations;
var addr2geocode;
run;


%rstart(dataformat=csv,data=gasstations,rGraphicsFile=);
datalines4;

   attach(gasstations)

   library(ggmap)

   gaddress <- as.character(gasstations$addr2geocode)
   locs <- geocode(gaddress,output="more")

;;;;
%rstop(import=locs);

proc print data=locs(drop=var2);
run;

The output that is returned from the PROC Print looks like:

                                                 The WPS System                     19:07 Thursday, November 14, 2013    1
                                                                                                                                    
 Obs          lon          lat type             loctype                                                                             
                                                                                                                                    
   1  -85.5396645   42.9127946 street_address   rooftop                                                                             
   2   -85.540616    42.913151 street_address   rooftop                                                                             
   3  -85.5308863   42.9129211 street_address   range_interpolated                                                                  
   4  -85.5678243   42.9128289 street_address   rooftop                                                                             
   5  -85.5692342    42.921498 street_address   range_interpolated                                                                  
   6  -85.5692342    42.921498 street_address   range_interpolated                                                                  
   7  -85.5685794   42.9125298 street_address   range_interpolated                                                                  
   8  -85.5700129   42.9125533 street_address   range_interpolated                                                                  
                                                                                                                                    
 Obs address                                                                                               north        south       
                                                                                                                                    
   1 5189 28th street southeast, grand rapids, mi 49508, usa                                         42.91414358  42.91144562       
   2 5155 28th street southeast, grand rapids, mi 49512, usa                                         42.91449998  42.91180202       
   3 5556 28th street southeast, grand rapids, mi 49512, usa                                         42.91427683  42.91157887       
   4 4045 28th street southeast, grand rapids, mi 49512, usa                                         42.91417788  42.91147992       
   5 2305 east paris avenue southeast, grand rapids, mi 49546, usa                                   42.92284723  42.92014927       
   6 2305 east paris avenue southeast, grand rapids, mi 49546, usa                                   42.92284723  42.92014927       
   7 3960 28th street southeast, grand rapids, mi 49512, usa                                         42.91388553  42.91118757       
   8 3927 28th street southeast, grand rapids, mi 49512, usa                                         42.91389553  42.91119757       
                                                                                                                                    
 Obs         east         west  postal_code country         administrative_area_level_2 administrative_area_level_1 locality        
                                                                                                                                    
   1 -85.53831552 -85.54101348        49508 united states             kent                      michigan            grand rapids    
   2 -85.53926702 -85.54196498        49512 united states             kent                      michigan            grand rapids    
   3 -85.52953782 -85.53223578        49512 united states             kent                      michigan            grand rapids    
   4 -85.56647532 -85.56917328        49512 united states             kent                      michigan            grand rapids    
   5 -85.56787602 -85.57057398        49546 united states             kent                      michigan            grand rapids    
   6 -85.56787602 -85.57057398        49546 united states             kent                      michigan            grand rapids    
   7 -85.56723042 -85.56992838        49512 united states             kent                      michigan            grand rapids    
   8 -85.56866387 -85.57136183        49512 united states             kent                      michigan            grand rapids    
                                                                                                                                    
 Obs street                                  streetNo point_of_interest query                                                       
                                                                                                                                    
   1 28th street southeast                       5189        NA         5189 28th St Se, Grand Rapids, MI                           
   2 28th street southeast                       5155        NA         5155 28th St Se, Grand Rapids, MI                           
   3 28th street southeast                       5556        NA         5556 28th St Se, Grand Rapids, MI                           
   4 28th street southeast                       4045        NA         4045 28th St Se, Grand Rapids, MI                           
   5 east paris avenue southeast                 2305        NA         2305 E Paris Ave Se, Grand Rapids, MI                       
   6 east paris avenue southeast                 2305        NA         2305 E Paris Ave Se, Grand Rapids, MI                       
   7 28th street southeast                       3960        NA         3960 28th St Se, Grand Rapids, MI                           
   8 28th street southeast                       3927        NA         3927 28th St Se, Grand Rapids, MI

                                                 The WPS System                     19:07 Thursday, November 14, 2013    2
                                                                                                                                    
 Obs          lon          lat type             loctype                                                                             
                                                                                                                                    
   9   -85.556224    42.946438 street_address   rooftop                                                                             
  10    -85.50019    42.915388 street_address   rooftop                                                                             
  11    -85.50019    42.915388 street_address   rooftop                                                                             
  12   -85.499809    42.913584 street_address   rooftop                                                                             
  13   -85.583252    42.916731 street_address   rooftop                                                                             
  14   -85.583252    42.916731 street_address   rooftop                                                                             
  15   -85.583243    42.916087 street_address   rooftop                                                                             
  16   -85.570236    42.947743 street_address   rooftop                                                                             
                                                                                                                                    
 Obs address                                                                                               north        south       
                                                                                                                                    
   9 4591 cascade road southeast, grand rapids, mi 49546, usa                                        42.94778698  42.94508902       
  10 6799 cascade road southeast, grand rapids, mi 49546, usa                                        42.91673698  42.91403902       
  11 6799 cascade road southeast, grand rapids, mi 49546, usa                                        42.91673698  42.91403902       
  12 6820 cascade road southeast, grand rapids, mi 49546, usa                                        42.91493298  42.91223502       
  13 2560 east beltline avenue southeast, centerpointe mall, grand rapids, mi 49546, usa             42.91807998  42.91538202       
  14 2560 east beltline avenue southeast, centerpointe mall, grand rapids, mi 49546, usa             42.91807998  42.91538202       
  15 2600 east beltline avenue southeast, centerpointe mall, grand rapids, mi 49546, usa             42.91743598  42.91473802       
  16 4018 cascade road southeast, grand rapids, mi 49546, usa                                        42.94909198  42.94639402       
                                                                                                                                    
 Obs         east         west  postal_code country         administrative_area_level_2 administrative_area_level_1 locality        
                                                                                                                                    
   9 -85.55487502 -85.55757298        49546 united states             kent                      michigan            grand rapids    
  10 -85.49884102 -85.50153898        49546 united states             kent                      michigan            grand rapids    
  11 -85.49884102 -85.50153898        49546 united states             kent                      michigan            grand rapids    
  12 -85.49846002 -85.50115798        49546 united states             kent                      michigan            grand rapids    
  13 -85.58190302 -85.58460098        49546 united states             kent                      michigan            grand rapids    
  14 -85.58190302 -85.58460098        49546 united states             kent                      michigan            grand rapids    
  15 -85.58189402 -85.58459198        49546 united states             kent                      michigan            grand rapids    
  16 -85.56888702 -85.57158498        49546 united states             kent                      michigan            grand rapids    
                                                                                                                                    
 Obs street                                  streetNo point_of_interest query                                                       
                                                                                                                                    
   9 cascade road southeast                      4591        NA         4591 Cascade Rd Se, Grand Rapids, MI                        
  10 cascade road southeast                      6799        NA         6799 Cascade Rd Se, Grand Rapids, MI                        
  11 cascade road southeast                      6799        NA         6799 Cascade Rd Se, Grand Rapids, MI                        
  12 cascade road southeast                      6820        NA         6820 Cascade Rd Se, Grand Rapids, MI                        
  13 east beltline avenue southeast              2560        NA         2560 E Beltline Ave Se, Grand Rapids, MI                    
  14 east beltline avenue southeast              2560        NA         2560 E Beltline Ave Se, Grand Rapids, MI                    
  15 east beltline avenue southeast              2600        NA         2600 E Beltline Ave Se, Grand Rapids, MI                    
   16 cascade road southeast                      4018        NA         4018 Cascade Rd Se, Grand Rapids, MI                        

                                                   The WPS System                     19:07 Thursday, November 14, 2013    3
                                                                                                                                    
 Obs          lon          lat type             loctype                                                                             
                                                                                                                                    
  17  -85.5879549    42.912112 street_address   rooftop                                                                             
  18   -85.569238    42.948355 street_address   rooftop                                                                             
  19  -85.5499342   42.8836312 street_address   range_interpolated                                                                  
  20   -85.607639    42.912997 street_address   rooftop                                                                             
                                                                                                                                    
 Obs address                                                                                               north        south       
                                                                                                                                    
  17 3214 28th street southeast, grand rapids, mi 49512, usa                                         42.91346098  42.91076302       
  18 4033 cascade road southeast, grand rapids, mi 49546, usa                                        42.94970398  42.94700602       
  19 4665 44th street southeast, kentwood, mi 49512, usa                                             42.88497343  42.88227547       
  20 2411 28th street southeast, grand rapids, mi 49512, usa                                         42.91434598  42.91164802       
                                                                                                                                    
 Obs         east         west  postal_code country         administrative_area_level_2 administrative_area_level_1 locality        
                                                                                                                                    
  17 -85.58660592 -85.58930388        49512 united states             kent                      michigan            grand rapids    
  18 -85.56788902 -85.57058698        49546 united states             kent                      michigan            grand rapids    
  19 -85.54858527 -85.55128323        49512 united states             kent                      michigan            kentwood        
  20 -85.60629002 -85.60898798        49512 united states             kent                      michigan            grand rapids    
                                                                                                                                    
 Obs street                                  streetNo point_of_interest query                                                       
                                                                                                                                    
  17 28th street southeast                       3214        NA         3214 28th St Se, Grand Rapids, MI                           
  18 cascade road southeast                      4033        NA         4033 Cascade Rd Se, Grand Rapids, MI                        
  19 44th street southeast                       4665        NA         4665 44th St Se, Kentwood, MI                               
  20 28th street southeast                       2411        NA         2411 28th St Se, Grand Rapids, MI                           
                                                                                                                

About the author: Phil Rack is President of MineQuest Business Analytics, LLC, located in Grand Rapids, Michigan. Phil has been a SAS language developer for more than 25 years. MineQuest provides WPS and SAS consulting and contract programming services and is an authorized reseller of WPS in North America.

Calculating Driving Distances with WPS and the Bridge to R

A few weeks ago, there was a posting on SAS-L where the poster was attempting to get the driving distance between two cities using Google’s mapping services. I found that a rather interesting question and decided to see what I could do using WPS and the Bridge to R.

For those who are unfamiliar with the Bridge to R, it is a product from MineQuest Business Analytics that allows you to execute R statements from within the WPS environment. You can pass WPS datasets to R and return R data frames to WPS quite easily. You also get the R log and list files returned to your WPS session in the corresponding log and list windows.

Here is the code that we used to create a table of driving distances for three pairs of cities. The output is printed using PROC PRINT in WPS.

*--> data set for drive distances;
data rdset;
input fromdest $1-17 todest $ 20-36;
cards;
Grand Rapids, MI   State College, PA
Columbus, OH       Grand Rapids, MI
Chicago, IL        Grand Rapids, MI
;;;;
run;


%Rstart(dataformat=csv,data=rdset,rGraphicsFile=);
datalines4;

    attach(rdset)
    library(ggmap)

    from <- as.character(fromdest)
    to  <- as.character(todest)

    mydist <- mapdist(from,to)

;;;;
%rstop(import=mydist);

proc print data=mydist(drop=var2);
format m comma10. km comma8.2 miles 8.2 seconds comma7. minutes comma8.2 hours 6.2;
run;

And this is the output:

      Obs    from                  to                              m          km       miles    seconds     minutes     hours       
                                                                                                                                    
       1     Grand Rapids, MI      State College, PA         843,978      843.98      524.45     28,256      470.93      7.85       
       2     Columbus, OH          Grand Rapids, MI          521,289      521.29      323.93     17,543      292.38      4.87       
       3     Chicago, IL           Grand Rapids, MI          285,836      285.84      177.62      9,695      161.58      2.69       
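
As a rough cross-check of the table above, the km, miles, minutes and hours columns appear to be simple conversions of the m and seconds columns. The sketch below (in Python, purely for illustration) reproduces them. The factor of 0.0006214 miles per meter is an assumption inferred from the printed values rather than something stated in this post, and the function name derive_columns is hypothetical.

```python
# Assumed conversion factor, inferred from the printed output above.
MILES_PER_METER = 0.0006214


def derive_columns(meters, seconds):
    """Derive the km, miles, minutes and hours columns from m and seconds."""
    return {
        "km": round(meters / 1000, 2),
        "miles": round(meters * MILES_PER_METER, 2),
        "minutes": round(seconds / 60, 2),
        "hours": round(seconds / 3600, 2),
    }


# Raw values (m, seconds) taken from the three rows of the listing.
rows = [
    ("Grand Rapids, MI", "State College, PA", 843978, 28256),
    ("Columbus, OH", "Grand Rapids, MI", 521289, 17543),
    ("Chicago, IL", "Grand Rapids, MI", 285836, 9695),
]

for origin, dest, meters, seconds in rows:
    print(origin, "->", dest, derive_columns(meters, seconds))
```

If the assumption holds, each printed dictionary should match the corresponding row of the listing to two decimal places.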
                                                                                                                             

So you can see how handy WPS and the Bridge to R can be as a resource – kind of a Swiss Army knife if you like.

About the author: Phil Rack is President of MineQuest Business Analytics, LLC, located in Grand Rapids, Michigan. Phil has been a SAS language developer for more than 25 years. MineQuest provides WPS and SAS consulting and contract programming services and is an authorized reseller of WPS in North America.

Thursday Ramblings

Does anyone do comparisons of graphics cards and measure performance in a VM? Specifically, do certain graphics cards boost performance when running VMs on the desktop? I like to see my windows “snap” open when I switch from VM to VM. As a developer, I often wonder if spending an additional $150 on a popular graphics card will yield a perceptible performance boost.

Speaking of graphics cards, we recently bought a couple of used Nvidia Quadro graphics cards from a local CAD/CAM company that is upgrading their workstations. I got these at about 5% of their original retail price so I’m happy. We were having problems getting a couple of servers to go into sleep mode using Lights Out and we discovered that we needed a different graphics card to accomplish this. The plus side is that these are Nvidia cards with 240 CUDA cores and 4GB of RAM. So we now have the opportunity to try our hand at CUDA development if we want. I’m mostly interested in using CUDA for R.

One drawback to using CUDA, as I understand it, is that it is effectively single-user. Say you have a CUDA GPU in a server: only one job at a time can access the CUDA cores. If you have 240 CUDA cores on your GPU and would like to allocate 80 of them to an application, thinking you can run three of your apps at a time, that is not possible. What it seems you have to do is install three graphics cards in the box so that each user or job has access to a single card.

There’s a new Remote Desktop application coming out from MS that will run on your Android device(s), as well as a new release from the Apple Store. I use the RDC from my Mac mini and it works great. I’m not sure what they could throw into the app to make it more compelling, however.

Tom’s Hardware has a fascinating article on SSDs and performance in a RAID setup. On our workstations and servers, we have SSDs acting as a cache for the work and perm folders on our drive arrays. According to the article, RAID0 performance tends to top out with three SSDs for writes and around four for reads.

FancyCache from Romex Software has become PrimoCache. It has at least one new feature that I would like to test, and that is L2 caching using an SSD. PrimoCache is in beta, so if you have the memory and hardware, it might be advantageous to give it a spin to see how it could improve your BI stack. We did a performance review of FancyCache in a series of posts on Analytic Workstations.

FYI, PrimoCache is not the only caching software available that can be used in a WPS environment. SuperSpeed has a product called SuperCache Express 5 for Desktop Systems. I’m unsure if SuperCache can utilize an SSD as a Level 2 cache. It is decently priced at $80 for a desktop version but $450 for a standard Windows Server version. I have to admit, $450 for a utility would give me cause for pause. For that kind of money, the results would have to be pretty spectacular. SuperSpeed offers a free evaluation as well.

If you are running a Linux box and want to enjoy the benefits of SSD caching, there’s a great blog article on how to do this for Ubuntu from Kyle Manna. I’m very intrigued by this and if I find some extra time, may give it the old Solid State Spin. There’s also this announcement about the Linux 3.10 Kernel and BCache that may make life a whole lot easier.

About the author: Phil Rack is President of MineQuest Business Analytics, LLC, located in Grand Rapids, Michigan. Phil has been a SAS language developer for more than 25 years. MineQuest provides WPS and SAS consulting and contract programming services and is an authorized reseller of WPS in North America.

Handling Excel Tables using WPS on Linux

I thought I’d take a moment to make a quick announcement about the Bridge to R as it pertains to Linux. Currently, using WPS on Linux, one cannot read and write Excel files (either .xls or OOXML, which is the .xlsx format) using the DBfiles engine. MineQuest has been working on a solution to this: the Bridge to R will support the reading and writing of these two file types, and we will be rolling that out in about a week. We need to do a little more testing and write some documentation before we can release it.

So if you require the ability to create Excel Worksheets using WPS on Linux, the Bridge to R for Linux will soon provide support for that functionality.

About the author: Phil Rack is President of MineQuest, LLC. and has been a SAS language developer for more than 25 years. MineQuest provides WPS and SAS consulting and contract programming services and is a reseller of WPS in North America.

What is in Your BI Stack?

Earlier this week, I was sitting around talking to a few friends at a place called the Tilted Kilt (not a bad place either) about what constitutes their analytics platform with regard to the software that they use on a regular basis. One fellow works for a major finance house here in town, the second person works for an educational consortium, and the third works for an advertising agency of about 100 employees.

Pretty much as expected, both the finance and the education employee are stuck with what is “provided” by their employer. In other words, they’re not allowed by the IT group to add any software to the “standard” analytics desktop (whatever that means). The software for these two folks was pretty straightforward and included an expensive stat package (SAS) and Excel.

Lynn, who works for an ad agency, was fairly unique in my view because of the diversity of tools she had at her disposal. She had the standard Microsoft Office install, but also had SPSS, Stata, RapidMiner and R, as well as a data visualization package whose name I simply cannot remember right now.

I understand that some of the tools that Lynn uses are chosen because they are open source and cost effective, but she’s also one of the smartest data analysts I’ve known over the last six or seven years. It started me thinking about what I use most often. Currently, my BI stack consists of:

WPS – a SAS language compatible software application

R – open source statistics and graphics

Bridge to R – interface into the R system for WPS users

Excel – spreadsheet

Ggobi – data visualization

Google Refine – data cleansing

Looking at my list, three of the six software applications are open source.

I’m curious to hear from others on what constitutes your BI stack and whether your organization allows you to augment the software with tools of your choice. I’m especially interested in hearing how your company deals with open source software and if you think that having a choice of tools allows you to think outside the box?

About the author: Phil Rack is President of MineQuest, LLC. and has been a SAS language developer for more than 25 years. MineQuest provides WPS and SAS consulting and contract programming services and is a reseller of WPS in North America.

Submitting R Programs Remotely using Dropbox

One of the great software applications currently available is a product called Dropbox. Dropbox is a piece of downloadable software that allows you to share your files between different computers by dropping a file into your Dropbox folder. Dropbox automatically syncs the files between all the computers that have access to your Dropbox folder. The great thing about Dropbox is that it just works and is smooth as can be.

I’ve been using Dropbox for about two or three months now and thought how great it would be to extend the functionality of Dropbox by being able to place a WPS or R file into a specific folder and have it automatically execute and write the output back into the Dropbox folder. Basically, you would have access to your organization’s server for executing programs while travelling or working onsite.

My experimentation with this is under Windows, and I put together a little application that will allow you to remotely submit an R job. On my server, I have a filewatcher program that monitors the Dropbox folder of my choosing, and when it sees a new R program (i.e. one with a .R extension) it fires up R and processes the program. The system writes back any output to the Dropbox folder so you also have your .lst and .log files to review. You can also directly write output from your program (say, an R data frame file you created) by referencing the folder in your program.
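
To make the idea concrete, here is a minimal sketch of a polling file watcher, written in Python for illustration only. It is not the Drop4R implementation. The function names, the polling approach, and the choice to send Rscript's standard output to a .lst file and its error stream to a .log file are all assumptions, and Rscript is assumed to be on the server's PATH.

```python
import subprocess
import time
from pathlib import Path


def find_new_scripts(folder, seen):
    """Return .R files in `folder` whose names are not yet in `seen`."""
    scripts = sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == ".r"
    )
    return [p for p in scripts if p.name not in seen]


def run_script(script, runner=subprocess.run):
    """Run one R program, writing .lst and .log files next to it.

    Assumption: stdout goes to the .lst file and stderr to the .log file,
    so that Dropbox syncs both back to the submitting machine.
    """
    lst_path = script.with_suffix(".lst")
    log_path = script.with_suffix(".log")
    with open(lst_path, "w") as lst, open(log_path, "w") as log:
        runner(["Rscript", str(script)], stdout=lst, stderr=log, check=False)
    return lst_path, log_path


def watch(folder, interval=5.0, max_polls=None, runner=subprocess.run):
    """Poll `folder` and run each new .R file once.

    `max_polls=None` polls forever; a number stops after that many polls.
    Returns the set of script names that were processed.
    """
    seen = set()
    polls = 0
    while max_polls is None or polls < max_polls:
        for script in find_new_scripts(folder, seen):
            seen.add(script.name)
            run_script(script, runner)
        polls += 1
        if max_polls is None or polls < max_polls:
            time.sleep(interval)
    return seen
```

Calling watch() with the path of the monitored Dropbox folder would start the loop. Two simplifications are worth noting: a script is identified by file name only, so re-submitting a changed file under the same name would not re-run it, and the sketch does not wait for Dropbox to finish syncing a file before running it.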

I’ve included a little video of how R and Dropbox can be used to submit R programs on a remote server using a browser and place the output back into a Dropbox folder.

Click here to view a short (2:30) video of Drop4R

Of course, you don’t have to use a browser to place the files in the Dropbox folder. You can always just copy and paste or drag and drop the R program into the Dropbox folder and the Job Spawner will simply execute the R program.

I’ve created a small zip file that contains a first draft of an installation guide on how you can setup Drop4R on your Windows computers. I’ve made the application freely available and you can use it without any restrictions.

Links:

Installation Guide: Dropbox Guide

Drop4R Installation File: drop4r.zip

About the author: Phil Rack is President of MineQuest, LLC. and has been a SAS language developer for more than 25 years. MineQuest provides WPS and SAS consulting and contract programming services and is a reseller of WPS in North America.
