All posts by Phil Rack

Phil Rack, Owner of MineQuest Business Analytics, LLC, Grand Rapids, MI, USA. For more than twenty-five years, I’ve worked as a SAS Consultant, specializing in the financial industry. My motive for this blog is to inform and educate other consultants, as well as clients who use SAS or WPS, about how they can use technology more effectively to further their business objectives.

Product Overview WPS for Workstations v3.3

We just updated our Product Overview document entitled WPS for Workstations v3.3. This document explains the features and breadth of the WPS Workstation product for version 3.3.

The document contains a list of all the database engines and procedures that are included in the workstation product. For organizations considering WPS, this is a good place to start to understand the WPS offerings on OS X and Windows.

You can access the document (1.4 MB PDF) at: http://minequest.com/downloads/WPS-for-Workstations-Marketing-Brochure.pdf

About the author: Phil Rack is President of MineQuest Business Analytics, LLC located in beautiful Tucson Arizona. Phil has been a SAS language developer for more than 25 years. MineQuest provides WPS consulting and contract programming services and is an authorized reseller of WPS in North America.

WPS v3.3 Now Available

On Thursday, December 15th, WPL introduced WPS v3.3. This new version is available for immediate download. With v3.3, WPS includes a slew of new procedures that will be of great value to those who hold WPS licenses and to those who are looking to convert from SAS to WPS.

New Language Procedures

Matrix Language Support is now available with PROC IML. PROC IML is included as a standard procedure in WPS and is not an additional cost module. A 430-page programming guide in PDF format, included in the installation folder, details how to use the Matrix Programming Language.

Python Support is now included in WPS with PROC PYTHON. PROC PYTHON allows a WPS user to create, edit and invoke Python programs from within WPS. The implementation of PROC PYTHON is very similar to PROC R. PROC PYTHON is included in WPS and is not an additional cost module.
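Here is a minimal sketch of what a PROC PYTHON step might look like. It assumes the statements mirror PROC R (EXPORT, SUBMIT/ENDSUBMIT and IMPORT) and that an exported dataset arrives in Python as a pandas DataFrame; check the WPS reference for the exact syntax. The dataset and variable names are made up for illustration.

```sas
/* Hypothetical input data */
data work.heights;
   input name $ height;
cards;
Ann 64
Bob 70
Cy 68
;
run;

proc python;
   /* Assumption: EXPORT hands the WPS dataset to Python under the given name */
   export data=work.heights python=heights;
   submit;
# Plain Python from here to ENDSUBMIT: convert inches to centimeters
heights["height_cm"] = heights["height"] * 2.54
   endsubmit;
   /* Assumption: IMPORT brings the Python object back as a WPS dataset */
   import python=heights data=work.heights_cm;
run;
```

The Python lines inside the SUBMIT block start in column one because Python is sensitive to indentation.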

ODS Support

WPS now includes output to PDF as well as HTML and Text output destinations. Note that PDF support is available on all platforms except z/OS at this time.

New Statistical Procedures

PROC ACECLUS – Provides two methods for approximating the within-cluster covariance structure for a clustering model under the assumption of equal multivariate Gaussian distributed clusters.

PROC CANCORR – Identifies and measures the associations between two sets of variables.

PROC GENMOD – Fits generalized linear models.

PROC LIFEREG – Fits parametric, accelerated failure time models in the presence of left-, right- and interval-censored data.

PROC LIFETEST – Estimates non-parametric survival functions in the presence of censored data using Kaplan-Meier or actuarial methods.

PROC LOESS – Fits non-parametric regression surfaces to multi-dimensional input data. The smoothness of the non-parametric model can be controlled. Outliers in the input data are detected.

PROC MI – Imputes missing values in an input dataset.

PROC MIXED – Fits a mixed linear model to input data.

PROC MODECLUS – Produces various cluster output statistics.

PROC PHREG – Fits the Cox proportional hazards model to survival data.

PROC PROBIT – Fits binary or ordinal response regression models, useful for dose-response type analysis. The procedure supports several types of model. Parameter estimates are generated by maximum likelihood estimation. Model fit statistics enable the quality of the generated model to be assessed.

PROC VARCOMP – Fits generalized linear models with random effects, where the associated covariance matrix is assumed to be diagonal.

Note that WPS Statistics is included in the cost of a WPS license and is not a module that needs to be licensed separately at an additional cost.

New Graphics Procedure

PROC GBARLINE – The GBARLINE Procedure has been added to WPS. This procedure allows you to generate bar charts with plot data overlaid on the bars.

New Data Engine

XLSX Engine – This is a cross-platform engine that provides read and write access to files in Microsoft Excel format. You can process Excel data on any platform you choose and are no longer limited to Windows platforms. The XLSX engine is included in WPS and is not an additional cost module.
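As a rough sketch, reading and writing a workbook through the engine could look like the following. It assumes the WPS engine follows the usual LIBNAME conventions; the file path, sheet name and dataset names are hypothetical.

```sas
/* Hypothetical workbook path; on Linux or OS X use a POSIX path instead */
libname xl xlsx 'c:\temp\sales.xlsx';

/* Read a worksheet into WORK (the sheet name Sheet1 is an assumption) */
data work.sales;
   set xl.Sheet1;
run;

/* Write a dataset back to the workbook as a new worksheet */
data xl.sales_copy;
   set work.sales;
run;

/* Release the workbook */
libname xl clear;
```

Because the engine is cross-platform, the same program should only need a different path to run on a Linux server.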

Data Engine Enhancements

NETEZZAM – Is a replacement for the NETEZZA Engine. NETEZZAM provides multi-threaded operation using a new architecture, enabling significant performance increases. The NETEZZAM engine is included in WPS and is not an additional cost module.

ORACLEM – Is a replacement for the ORACLE Engine of prior releases. ORACLEM is also multi-threaded, bringing performance increases. The ORACLEM engine is included in WPS and is not an additional cost module.

Both of the above engines can bulk-load data.

There are a number of additional language features and workbench features that are worth investigating as well. WPS v3.3 is a major release in which the functionality, language and procedures have all been augmented.

For a list of all the WPS Procedures and Database Engines that are currently supported in v3.3, you can download a two-page brochure from MineQuest. This brochure lists the database engines that are supported on the Linux, OS X and Windows platforms as well as language support and PROC Support.

The Application Economy

I pretty much finished up my Christmas shopping two weeks early this year. Even the wrapping and delivery were completed, thanks to Amazon. I’ve never had my shopping done so early in December and I’m darn happy about that!

That gave me time to watch some TV this weekend and since much of College Football is over for the season I ended up cruising over to Bloomberg TV. I watched a program called “Hello World” on the Russian Tech scene and it was fascinating to learn about what was being created in Russia.

The sponsor of the show was CA (aka Computer Associates) and they had an interesting and entertaining commercial titled “The Front Porch” which is about the Application Economy. We as analytical developers rarely think about software as an application the way consumers do. Our customers are often different departments or divisions in the corporation where we work. We don’t create an application product that meets the needs of tens of thousands of users, or even millions. We mostly develop products used by tens of people or, if we are lucky, hundreds.

A lot of the reason for that is that many of us don’t see what we do as developing an application that is consumed by users outside of our organization. The cost of commercial software is often so high that it makes it cost prohibitive to invest the hundreds of hours needed to create the application. The other issue many run into is the availability of data that can meet the needs of the consumer and is not protected by agreements.

The market has responded with software such as Python and R. However, the problem with both is the amount of data that can be processed. We live in a Big Data world and expecting data to fit into available memory is often not practical. Many of us are also dependent on using the Language of SAS for processing and displaying of data.

Obviously, WPS is a better choice than SAS when it comes to pricing, especially on the desktop. If you create an application that requires, say, WPS on a workstation, it is much easier to make a sale (your application and a WPS license) when the first-year cost is one-tenth the cost of the SAS system.

In future articles, I want to touch on creating applications for resale using WPS. I want to talk about “applications” for such things as Smart Cities, Marketing, Credit Scoring and Fraud Analytics.

We truly live in an era where we as analysts and statistical developers can contribute our skills starting a business, providing a product and doing it all with minimal cash outlay. The internet is a money pipe into the home and business. Don’t let the opportunity pass you by.

Disruptive Analytics

A few days ago I picked up Disruptive Analytics, Thomas Dinsmore’s recent book (available on Amazon), and thought I would leave my impressions. Note this is not a review! First, I really enjoyed the history of the analytics platforms. The second and third chapters were very informative (History and Open Source respectively) and I learned a few things!

Regarding Open Source, I agree that we will see Python supplant R as the “go to language” for analytics in the Open Source arena. It might take a few years, but if my customers’ interests are indicative of this trend, it will happen.

Dinsmore does an admirable job in Chapter 4 on Hadoop. This chapter is fairly dense reading for me mainly because there are a lot of terms and definitions in this chapter. If you were ever looking for an overview of the Hadoop ecosystem, this is probably a good start.

The other chapter I really liked was Chapter 6. This chapter deals with streaming analytics and I believe we are just in the infancy of this revolution. Smart Cities will be a very visible platform for many people to see and benefit from streaming analytics.

In a future edition, I would like to see a presentation of the role of the analytics workstation and flash memory in the analytics framework. Data scientists who are developing algorithms and processing data are often using workstations in lieu of servers. Perhaps even a few pages on how nVidia is revolutionizing the analytics world with CUDA processing on high-power workstations. I think I would enjoy that.

About the author: Phil Rack is President of MineQuest Business Analytics, LLC located in beautiful Tucson Arizona. Phil has been a SAS language developer for more than 25 years. MineQuest provides WPS and SAS consulting and contract programming services and is an authorized reseller of WPS in North America.

Ubuntu 16.04 Released and Quick Test Drive

In the last week, Canonical has brought forth a new release of Ubuntu and it is pretty nice! Version 16.04 has a number of great features that should be of value to those who use Linux. One thing that Ubuntu has at this point is a vertical line of products. I can’t think of any other vendor who has an OS that runs on phones, tablets, notebooks/workstations, servers and mainframes.

I decided to give it a try on one of my workstations, running it in an Oracle virtual machine (VirtualBox to be specific), to see how WPS runs on this new release. To cut to the chase, it runs quite well. As a matter of fact, once I got the VM to use all of its allotted storage, WPS ran like a charm.

A couple of things that might be of interest to potential Ubuntu upgraders. First, Ubuntu 16.04 supports ZFS. That might be important to a few sites. The second is the support for LXD 2.0. From the Ubuntu website –

LXD 2.0

Ubuntu 16.04 LTS includes LXD, a new, lightweight, network-aware, container manager offering a VM-like experience built on top of Linux containers.

LXD comes pre-installed with all Ubuntu 16.04 server installations, including cloud images and can easily be installed on the Desktop version too. It can be used standalone through its simple command line client, through Juju to deploy your charms inside containers or with OpenStack for large scale deployments.

All the LXC components – LXC, LXCFS and LXD – are at version 2.0 in Ubuntu 16.04 LTS.

In addition to trying Ubuntu 16.04 in a VM, I have also tested it on a small server (6 LCPU with 32GB of RAM) running WPS. Although I have not benchmarked this exhaustively, it does appear that using v16.04 with WPS 3.2.2 (which is the latest release) provides a modest performance increase. This is most easily observed with multi-threaded procedures such as MEANS and SUMMARY.
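For anyone who wants to try a comparison like this on their own hardware, here is a rough timing sketch. It is not the test I ran; the table size, variable names and seed are all hypothetical.

```sas
/* Build a hypothetical test table: 5 million rows in 100 groups */
data work.big;
   do i = 1 to 5000000;
      grp = mod(i, 100);
      x = rannor(12345);
      y = ranuni(12345);
      output;
   end;
   drop i;
run;

/* Record the start time, run the procedure, then report elapsed seconds */
%let t0 = %sysfunc(datetime());

proc means data=work.big noprint;
   class grp;
   var x y;
   output out=work.stats mean=mean_x mean_y;
run;

%put NOTE: PROC MEANS elapsed: %sysevalf(%sysfunc(datetime()) - &t0) seconds;
```

Run the same program several times on each OS release and compare the elapsed times, since a single run can be skewed by disk caching.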

WPS v3.2.2 Released

Earlier in the week, World Programming released an update to WPS. Version 3.2.2 is mainly a maintenance release with a number of fixes. There are some improvements, and the two that caught my eye are:

25591: WPS can now handle record lengths up to, or even greater than 32K when writing to SAS7BDAT files.

25596: WPS on Linux now supports Sybase IQ 16.0 client drivers.

There are a number of other fixes that are probably more important than the two I chose above (especially if you are on MVS).

You can upgrade your installation by going to the WPS website and logging into the download servers (User ID and Password required). You can also read a list of all the changes by clicking on the change log file on the right-hand side of the screen.

Introducing WPS Express

Today, World Programming Ltd announced the availability of WPS Express. WPS Express is a product for those interested in learning the Language of SAS. WPS Express comes with all the database drivers and other modules of the Standard desktop version of WPS.

What separates WPS Express from the Standard Edition desktop experience is the number of records that can be processed. Currently, WPS Express processes 100 records.

WPS Express is meant to be a free product that allows you to learn the Language of SAS. As such, 100 records are probably sufficient to learn to program in the language, connect to many different databases, and run R.

One other caveat is that WPS Express is licensed to an individual and not to any organization. Again, it’s worth noting that this is a product to learn how to write code in the Language of SAS. Also, WPS Express is licensed on an annual basis so you will have to renew your license every year.

You can find WPS Express by going to the World Programming website and taking a look at: https://www.worldprogramming.com/try-or-buy/wps/editions/express

If you are interested in more formal WPS training, especially on how to use the WPS Workbench, I recommend that you reach out to Art Tabachneck. Art has a placement company called Analyst Finder that helps companies and recruiters find analytical talent. Art also has a one-day training seminar and he can do the training online. I’ve seen the syllabus and slide deck and think it’s quite complete with regard to getting a thorough understanding of the power of WPS. Interested parties can reach out to Art at: art297@rogers.com

Because of its 100-record limit, WPS Express is not a practical product for evaluating whether to swap out SAS for WPS. Any organization would need the standard edition, which processes an unlimited number of records, so that it could compare the output of the two products.

MineQuest Business Analytics is able and willing to help you and your organization with your evaluation of WPS. We can arrange for a free 30-day evaluation of the workstation products, both OS X and Windows as well as on all supported server platforms.

Interested in a quote or a free 30-day evaluation of the standard edition of WPS? If your organization is located in North America, simply fill out the Evaluation Request from our website.

PROC REG WPS v3.2–New Graphics and PMML

For those of you who have downloaded WPS v3.2, there are a number of new features. I want to show two of them using PROC REG. WPS now has the ability to create plots for PROC REG. Quite handy indeed!

Also, in PROC REG for v3.2, we see experimental support for PMML (Predictive Model Markup Language).

Here is some sample code that demonstrates the plots.

*--> Data is census population data from 1790 to 2010;
data census;
   input year pop @@;
   pop2 = round(pop/1000000, .1);   /* population in millions, one decimal */
   popsq = pop2*pop2;               /* squared term, not used in the model */
   lpop = lag(pop2);                /* previous census value, missing for 1790 */
cards;
1790 3929214 1800 5308483 1810 7239881 1820 9638453 1830 12860702 1840 17063353
1850 23191876 1860 31443321 1870 38558371 1880 50189209 1890 62979766 1900 76212168
1910 92228496 1920 106021537 1930 123202624 1940 142164569 1950 161325798
1960 189323175 1970 213302031 1980 236542199 1990 258709873 2000 291421906 2010 308745538
;;;;
run;

*--> PROC REG with the OUTPMML= option to output the model in PMML form;

filename outfile 'c:\temp\regpmml.txt';
proc reg data=census outpmml=outfile pmmlver="4_2" plots;
   model pop2 = year lpop;
   title "US Census Population – PROC REG";
run;

 

US Census Population – PROC REG
The REG Procedure
Model: MODEL1
Dependent variable: pop2

Number of Observations Read                   23
Number of Observations Used                   22
Number of Observations with Missing Values     1

Analysis of Variance
Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
Model               2           206768        103384   9307.59   <.0001
Error              19        211.04266      11.10751
Corrected Total    21           206979

Root MSE           3.332793   R-Square   0.998980
Dependent Mean   111.704545   Adj R-Sq   0.998873
Coeff Var          2.983579

Parameter Estimates
Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept    1           -299.75395         71.30929     -4.20     0.0005
year         1              0.16607          0.03878      4.28     0.0004
lpop         1              0.97176          0.02754     35.28     <.0001

[Figure: Residual plot]

[Figure: Diagnostics panel]

The PMML output generated is:

<?xml version="1.0" encoding="utf-8" ?>
<PMML version="4.2" xmlns="http://www.dmg.org/PMML-4_2">
    <Header copyright="World Programming Limited 2002-2015">
        <Application name="World Programming System (WPS)" version="3.2.0"/>
    </Header>
    <DataDictionary numbeOfFields="5">
        <DataField name="year" optype="continuous" dataType="double"/>
        <DataField name="pop" optype="continuous" dataType="double"/>
        <DataField name="pop2" optype="continuous" dataType="double"/>
        <DataField name="popsq" optype="continuous" dataType="double"/>
        <DataField name="lpop" optype="continuous" dataType="double"/>
    </DataDictionary>
    <RegressionModel functionName="regression" targetFieldName="pop2">
        <MiningSchema>
            <MiningField name="year"/>
            <MiningField name="lpop"/>
            <MiningField name="pop2" usageType="target"/>
        </MiningSchema>
        <RegressionTable intercept="-299.753951850233">
            <NumericPredictor name="year" coefficient="0.166074316077245"/>
            <NumericPredictor name="lpop" coefficient="0.971762137737628"/>
        </RegressionTable>
    </RegressionModel>
</PMML>
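The RegressionTable element encodes the fitted equation: pop2 = intercept + (year coefficient × year) + (lpop coefficient × lpop). As a small sketch, a DATA step can apply those coefficients by hand to score a new row. The dataset name, the variable pop2_hat and the choice of 2020 as the year to score are illustrative assumptions; the lag value of 308.7 is the 2010 population in millions as rounded by the census data step.

```sas
/* Score one hypothetical observation with the coefficients from the PMML file */
data work.score;
   year = 2020;     /* illustrative year to predict */
   lpop = 308.7;    /* 2010 population in millions, the lagged predictor */
   pop2_hat = -299.753951850233
            + 0.166074316077245 * year
            + 0.971762137737628 * lpop;
   put year= lpop= pop2_hat=;   /* write the prediction to the log */
run;
```

A scoring engine that reads PMML does the same arithmetic, which is what makes the format useful for moving a model between systems.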

Interested in a free 30 day evaluation of WPS? If your organization is located in North America, simply fill out the Evaluation Request from our website.

CleanWork for Windows

Recently, we decided to go back through some of our older programs to see if they could be updated and/or made open source. We wrote CleanWork years ago and often provided it to organizations that used our consulting services as a freebie and a way to say “Thank You.”

CleanWork does pretty much what the name says. It is a WPS program that, when run, cleans out the work folders of old and orphaned directories that are no longer used. WPS comes with a cleanwork program for Linux and Mac but not for Windows. The version written by MineQuest will run on Windows Workstations running Vista, 7, 8, 8.1 and 10. It will also run on Windows Servers such as Windows Server 2008, 2008 R2, 2012 and 2012 R2. Basically, it will run on all Windows Servers except 2003 and before. It also runs on all Windows Workstations except XP and before.

CleanWork is packaged in a zip file that contains the source code, the Usage Document, License and a sample program. CleanWork has been tested only on the WPS platform.

If you are running WPS on a Windows Server, you may want to set CleanWork to run on a schedule. This is a perfect utility to automate. For busy server installations, I could see setting a scheduler to run CleanWork every few hours.

The zip file contains five files. These are:

clean.sas – a sample program for running the cleanwork utility.

cleanwork_source.sas – the actual source code that implements the utility.

CleanWorkUsage.docx – a Microsoft Word document that explains how to use cleanwork.

SASMACR.wpccat – a compiled version of the macro that is ready to run.

license.txt – The license agreement for use of the source code and user document.

You can find the download by going to the bottom of the page here.

About the author: Phil Rack is President of MineQuest Business Analytics, LLC located in Grand Rapids, Michigan. Phil has been a SAS language developer for more than 25 years. MineQuest provides WPS and SAS consulting and contract programming services and is an authorized reseller of WPS in North America.

New Software added to the Stack

Here is some cool software that I’ve started using during 2015.

Places

Here is a cool tool if you find yourself pushing data all around, whether from server to server, cloud to server or anything in between. CoffeeCup software has a nifty utility called Places. It can read and write to Amazon Cloud Services, OneDrive, Box, Dropbox, Google Drive, Instagram and Flickr. I picked it up on a weekend sale for $9. Well worth it.

CrashPlan

CrashPlan is one of the best pieces of software we have at the house. We back up our Windows tablets, PCs and Macs onto a small PC with a large hard drive. It’s very easy to install and there are options that allow you to back up the target PC onto a portable drive or even to the cloud (which is an additional expense, but still quite reasonable).

Microsoft Office 16

I use Office all day long. I love it and Office 16 has raised the bar even further. I’ve been loath to use OneNote but have finally started to use it since it syncs so well across so many of my devices. I use Word and Excel extensively and really haven’t seen a single issue since I upgraded from the previous versions.

Skype

Skype is just about the best communications system I use. I make phone calls, video calls and text. Buying a subscription with a phone number gets you one step further towards being able to work remotely and not having to use a damn cell phone.

Skype is improving and becoming more robust with each iteration. New features seem to be aimed at the enterprise market but I suspect we will see some of these trickle down to the small business market very soon. The ability to do a web conference similar to Webex will be a big boon for small business customers and software developers working from home.
