The Future Job Market in BI Analytics

As a consultant, I often get questions from employees at the sites where I consult about where I see the job market going and how it is shaping employment opportunities. I see a few different tiers or levels where jobs call for specialized skills and where there is some room for growth.

Data Analysts and Report Specialists
This is probably the most common job opportunity out there. In the finance world, and banking in particular, you soon realize that these organizations run on reports and forms. True, many of these reports are now web reports, but reports are reports. Data Analysts, in my opinion, are the driving force behind success in market research, direct mail campaigns, etc… I’ve seen more data analysts move from entry- and mid-level positions to management positions than from any other position. After a few years, a data analyst becomes very valuable because he or she knows where the data is sourced and often knows the business side better. They understand how the data moves, what it means, and the impact it has on the business. They often understand it at a much more basic level, and better than the management they report to.

Spreadsheet and OLAP Jocks
Typically, low- and mid-level managers are the clients for these services. They generally understand plots, graphs and percentages. The data comes from just a few tables, and the importance of how the data is sourced is often ignored. That is, until a problem arises.

Statisticians and High-End Modeling
I have mixed feelings about the importance of statisticians in the private sector, particularly with regard to what often passes for Business Intelligence. I know so many companies that dread hiring a Ph.D. statistician. Their fears, generally well founded, are that these hires will grow bored and not be able to focus on the problem at hand. My observation is that they will start looking for their next position within a few months of being hired. I know this is generalizing, but I see it often in the private sector. As a matter of fact, this line of thought is so prevalent that there are articles warning against hiring Ph.D.s when in start-up mode. You may want to “Google” the terms “Eric Sink” and “Ph.D” for more detailed thoughts on this.

The other issue playing into the long term employment of Statisticians is how they can be most effectively used. I see a number of mid-sized companies that only use them as consultants. So, they bring them in, pay them to develop the model, and write code to execute the model. After that, programmers or Data Analysts engineer the code to run in a production environment. I see Statisticians taking on a mercenary role in the future.

Business Intelligence Administrators
This is a fairly recent phenomenon in the BI world. As these systems get more complex, specialized skills are being sought to help install, maintain, and troubleshoot them. The BI Admin also instructs users on how to use these systems. Due to the complexity of some of these systems, the skill requirements for these jobs are pretty high. The good news is that these jobs pay well, especially for consultants. Training is expensive but you really have no choice. For the most part, you cannot get OJT (On the Job Training) for installation.

I’ll reserve my rants for a future blog article on how these BI companies purposefully make installation so damn difficult by limiting their software to certain open-source software or by choosing to support platforms that are not the most popular ones. Can you say, “Apache, Linux, and Websphere?”

Sending E-mail with WPS

I’ve been busy writing and documenting the macro library that we plan to offer to WPS users on MS Windows platforms. Actually, this is coming along nicely but writing the documentation is taking a bit more time than I had originally anticipated.

One new program that I wrote, along with the macro code that calls it, might be of interest to many others. If you are running MS Outlook as your e-mail client, this program will allow you to send e-mails with attachments from WPS. This is useful if you use your e-mail system as a report distribution system.

The program is called SendToOutlook and is written in C#. This is not my first C# program but I would venture to say it’s one of two useful ones that I’ve written! I’m a Delphi programmer (I like Object Pascal since it’s a language that doesn’t use curly braces) so I tend to struggle with the terminology that .NET uses. Thanks to Alan Churchill at Savian for some pointers on writing this code and translating Delphi terminology into .NET terminology.

What I find amazing is how much you can do with the Express Editions of Microsoft’s .NET languages. It was a piece of cake to include the references and assemblies for interfacing with Outlook. There’s so much free code and documentation on C# that most programmers could pick up C# and be somewhat productive in a short period of time. Just to note, the MS Express Editions are free to download and use. You just need to register with MS to get a key that enables the software.

At any rate, SendToOutlook mimics the functionality of SAS’s e-mail interface. With SendToOutlook, you can include attachments, create subject lines, include a distribution list, and add your own text in the body of the e-mail. It does, however, share a fault with SAS’s implementation: Outlook’s security pops up a dialog box requesting confirmation and approval to allow an external program to send an e-mail. You can use the free program ClickYes to overcome this behavior, and it’s a very good program for dealing with just this kind of thing. It would just be nice to be able to skip that step. There are other ways to overcome this problem as well, including modifying registry settings and making changes on your Exchange Server. It’s best to search the web to find out how to implement these modifications.

We intend to include SendToOutlook in our suite of macros and add-ons to WPS.

Links
Savian: www.savian.net
ClickYes: http://www.contextmagic.com/
Microsoft Express: http://msdn.microsoft.com/vstudio/express/default.aspx

Announcing SAS to WPS Conversion Services

SAS to WPS Conversion Service
As some of you are probably aware, a few associates and I are pretty excited about WPS. If you have visited our web site in the last few days, you have seen that we are now gearing up to offer new services and a product for those who are considering converting to WPS from SAS or are just looking for support from knowledgeable third parties.

We will help your organization convert to WPS by reviewing code, updating programs, rewriting code as necessary, and providing our library of macros to help your transition go more smoothly. We can also assist you in sizing your server(s) and recommend hardware configurations to get the most out of your investments.

Product Suite of Macros
The product suite of macros and programs will help fill in some missing areas that your code may currently need to function. For example, we have developed macros that emulate the Mort functions, FIPS lookup functions, Zip lookup functions (planned), database specific Bulk Load code generators, some statistical and math functions, and a small family of OS specific functions for Windows. We also have macros that emulate PROC RANK and PROC STANDARD as well as a callable external program that is basically FSLIST on steroids.

Support Agreements
MineQuest will also offer Support Agreements for companies and organizations that have converted to WPS. We can provide short-term or long-term support, whether it’s for production code or for development, where we help your developers troubleshoot code and provide possible solutions.

Product Integration Assistance
MineQuest will also offer WPS integration assistance and services on the Windows Platforms. We can help provide performance tuning, design, development and troubleshooting when it comes to getting your software and users working together.

Links: www.minequest.com
www.teamwpc.co.uk/

WPS Data Sets to Replace SAS’s U.S. City File

Just a quick note today, after all, it is the weekend! One of the things that SAS ships with their installation package is a data set that contains some demographic information on US Cities that was obtained from data sources at the Census. Although it’s of limited use because it contains so little data, it does serve the purpose of having a data set immediately available to play with.

I just created a couple of similar data sets for WPS that contain much more demographic information than the SAS data set. These data sets are based on the US 2000 Census. The first data set is at the county level and contains 3,219 observations, one for each U.S. County. The other data set is a City-Places WPS data set that has 25,375 records. These data sets contain age and gender counts as well as some housing information.

You can find these files at my company’s web site at: www.minequest.com. Click on the “Downloads” button on the left to get to the data. If you are running WPS on a mainframe and want a copy of this data, please contact me at: PhilRack@MineQuest.com.

I hope these files are useful. Enjoy.

Links: www.minequest.com

Load and Stability Testing WPS

The last few days, I’ve been testing WPS with some rather large data sets to try to determine what kind of limits there are and to check the stability of the system under heavy load. So far, I’ve limited my tests to simple PROCs like Append, Copy, Freq, and Means on WPS data sets. In these tests, I keep increasing the size of the data sets as each batch successfully completes. I’ve now run tests on data sets of 1 million, 10 million, 100 million, 1 billion and 2 billion observations.

I believe that the upper limit of a WPS data set on Windows is (2^31) - 1, or 2,147,483,647 observations. I’ve tried to append data to go beyond that number and WPS graciously tells me that there is an error, “Too many records in data set xxxxxx” and refuses to load any more observations into the data set. As an interesting side note, when I discussed the upper bound with a colleague, he reminded me that SAS had this same limitation at one time as well.
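For anyone who wants to check the arithmetic, here is a minimal sketch in Python (used purely for illustration, since it is not part of WPS). It assumes the limit comes from an observation counter held in a signed 32-bit integer, which is my guess from the number itself and not something confirmed by World Programming. The function name can_append is hypothetical.

```python
# Largest value a signed 32-bit integer can hold.
# Assumption: the WPS observation counter is this type.
MAX_OBS = 2**31 - 1


def can_append(current_obs: int, new_obs: int, limit: int = MAX_OBS) -> bool:
    """Return True if appending new_obs records keeps the data set within the limit."""
    return current_obs + new_obs <= limit


print(f"{MAX_OBS:,}")                          # 2,147,483,647
print(can_append(2_000_000_000, 147_483_647))  # True: lands exactly on the limit
print(can_append(2_000_000_000, 147_483_648))  # False: one record too many
```

A data set of 2 billion observations therefore has room for only about 147 million more before the append fails.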

I’ve never had an opportunity at any site where I’ve worked to process such large amounts of data. I have consulted for a client that had 800 million credit card accounts and 32 billion transactions in their data warehouse (different vendors, private label cards, etc…), but analyzing a single portfolio never involved more than 160 million records.

Running these tests can be time consuming. Getting my head around the size of these files is something that does not come naturally to me. For example, going from a dataset of 1 million records to 2 billion records means the data set is 2,000 times larger. I don’t have the fastest Windows Server in town (I do have a lot of storage though) and I just basically let it chug while I work on other projects. But these tests do illustrate that WPS is capable of handling large files.

The real reason I’ve been performing such tests is to determine if WPS can replace SAS for the heavy lifting for which it is so often used. At this point, I believe the answer is yes. For those companies that are interested in using WPS as a backend for development and implementation of vertical market applications, database access and web enablement, without having to contend with the sky-high licensing fees requested by SAS Institute, this is something to have on your short list for evaluation. For those organizations that need to do daily heavy lifting of data, sorting, summarization and reporting, WPS can fulfill most of your needs in replacing SAS/Base, SAS/Access and SAS/IntrNet.

The WPS Eclipse Configurable IDE

One of the cool things about WPS is the IDE. The environment that hosts WPS is the Eclipse IDE and it’s quite configurable. I’ve written about this in a previous blog posting but I think it’s important to contrast this open environment with the closed IDE that SAS has. With Eclipse, you can use plug-ins to expand the capability of the IDE. I downloaded XMLBuddy’s Eclipse plug-in and installed it by simply unzipping the files into the plug-in directory. When I fired up WPS and clicked on File||New||Other, I was presented with an option to use the XMLBuddy editor. I can use that to edit and validate XML as well as generate DTDs. With the Professional version of XMLBuddy, you can do even more.

Here’s a screen shot that shows the XML Drop Down menu and the XMLBuddy Editor in WPS.

The WPS Libname Engine

One other thing I wanted to mention: WPS can read SAS data sets natively! No longer do you have to go through the excruciating pain of exporting your SAS data sets under Windows to some bizarre format so they can be read in elsewhere. Note, however, that WPS cannot write a SAS data set. To read a SAS data set using WPS, just create a libname statement with the SAS data set engine name. For example, if your SAS data set is in the directory D:\SASDATA you can write:

LIBNAME sasdsets sas7bdset "D:\SASDATA";

And presto, you have access to SAS data. That’s a great feature when you are dealing with users and companies who are in a mixed environment and are using both SAS and WPS.

Links:
WPS – www.teamwpc.co.uk

XMLBuddy – www.xmlbuddy.com

WPS Beta5 Release

Well, I finally got my hands on the latest WPS beta release (beta5) and I see a lot of improvements and fixes. I’ve been stress testing this release with large data sets and, with a few exceptions (ODBC related), performance and stability are quite good.

I’ve been focusing on writing macros to implement PROC RANK and PROC STANDARD in WPS. Both of these procs are absent in the latest release but I’ve reached a point where I’m nearly done with these macros. Once I finish these macros, I can do about 95% of what I can do in SAS/Base as a consultant and developer.
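For readers who don’t use these two procs, here is a minimal sketch of what they compute. It is written in Python purely as an illustration and is not the macro code itself. It assumes the SAS defaults as I understand them: PROC RANK assigns ascending ranks and gives tied values the mean of their ranks, and PROC STANDARD rescales a variable to a requested mean and standard deviation using the n-1 divisor. The function names are hypothetical, and BY groups and missing values are not handled.

```python
from statistics import mean, stdev


def rank_mean_ties(values):
    """Ascending ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Extend 'end' across the run of values equal to the one at 'pos'.
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the 1-based ranks in the run
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def standardize(values, target_mean=0.0, target_std=1.0):
    """Rescale values to the requested mean and standard deviation."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n-1 divisor)
    return [target_mean + target_std * (v - m) / s for v in values]


print(rank_mean_ties([10, 20, 20, 30]))  # [1.0, 2.5, 2.5, 4.0]
print(standardize([2, 4, 6]))            # [-1.0, 0.0, 1.0]
```

In the SAS language the same logic would typically be a sort followed by a data step for the ranks, and a PROC MEANS pass followed by a data step for the standardization.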

I’m really getting excited about this product for a number of reasons. First, as a consultant and SAS developer, I see the opportunity to expand my client base. Many small and medium-sized businesses simply cannot afford to license SAS at the rates the Institute demands. WPS gives these smaller companies a shot at implementing SAS-like solutions without breaking the bank. With more companies being able to afford these solutions using WPS, I believe my consulting practice will grow with this market expansion.

I’ve also read comments that WPS doesn’t have the statistical procs that SAS does. In my 20 plus years as a SAS consultant, I’m still amazed at how much of the work I do is performed in just Base and Access. Unless you’re a statistician, you will probably never miss those procs anyway.

I also like the fact that there’s competition in this arena. With competition, I feel that things will only get better. There certainly will be more improvements with an eye towards the developers who are the guns in the trenches (and not just the CIO and the non-stop kissing up to management). That’s why I’m excited about WPS: it has placed its focus on the developer.

Watch out SAS Institute. There’s a value player in the market now!

Links: http://www.teamwpc.co.uk

A Look at the WPS IDE

I’ve received some requests asking to see what the development environment looks like for WPS on Windows. I guess that’s fair considering all the complaining I did in a previous blog posting on how terribly retro SAS’s environment is.

Below are some screen captures that show the WPS editor. Note that the IDE uses the Eclipse environment. The more I use it, the more I like it. It’s fast and fairly intuitive to use. Windows are nicely laid out and easy to find.

The first screen capture is of the IDE and the Editor. I have a SAS program in the editor that reads the SF3 Census file.

The second screen below is a browse window that has been opened onto the WPS data set that was just created.

The third screen capture shows the output window with a proc means that was just run. Note that I didn’t set up any page size or line size options.

And finally, the fourth screen capture shows the WPS LOG window.

As you can see from the screen shots, if you are a SAS developer, this should all look pretty familiar to you. The environment is friendly and easy to use, and any programmer should adapt to it easily.

I’ll be posting more about WPS, the SAS/Base alternative, in the next few days.

Stay tuned!

Links: http://www.teamwpc.co.uk

Looking for SAS 9 BI Consultants

I was recently contacted by a recruiter who is in immediate need of SAS consultants with experience installing and configuring SAS 9 BI. This is not a permanent position; however, due to the number of projects that they have won, the consultants could work a long time on multiple back-to-back projects. The consultants can be located anywhere as long as they can travel. There are projects across various cities in the US. They are looking for several consultants since a couple of the projects are concurrent. Lastly, they are open to working with the consultant’s preference of a W2, 1099 or Corp-to-Corp arrangement, as long as the rate makes sense.

The client is looking for qualified SAS 9 EBI (Enterprise Business Intelligence) consultants who have a good mix of the following skills and experience.

Specialists in SAS Institute software:

* Business Intelligence
* Statistics
* Analytics
* Data Warehouse
* Data Mining
* OLAP – MDDB
* Healthcare
* Pharmaceuticals; and
* Application Development.

In-depth experience with:

* Program Management
* Project Management
* Business Systems Analysis
* Business Process Analysis
* Software Development
* Software Quality Assurance
* Utility Deregulation; and
* Telecommunications.

If you are interested, please shoot me an e-mail (PhilRack@minequest.com) with your phone number and e-mail address and I will be happy to provide the introductions.

Phil Rack
www.minequest.com

Thoughts on Benchmarking SAS and WPS

In a previous post, I stated that I wrote some benchmarks comparing SAS to WPS to get an idea of how they stack up against each other. After giving this some additional thought, I’ve decided that this is not fair to WPS, mainly because it’s still in beta. I think a comparison should be made with the release versions of both packages.

So, with the knowledge of those at World Programming, I will write some code that will perform some simple benchmarks between the two systems when WPS goes live (i.e. not a beta release) and will compare and contrast the times for the data step and a few PROCs such as Means, Sort, Freq and Tabulate. I’ve also agreed to first let World Programming review the source code that will be used and will entertain comments from them about how well written the code is and whether it’s a justifiable benchmark.

I will post the code and the data sets that I use on my website (www.minequest.com) so that others can review the code and can comment and critique the source code as well. This is not designed to be an exhaustive benchmarking between the two but will provide those interested a simple basis for comparison.
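To give an idea of the shape such a benchmark might take, here is a minimal timing harness. It is sketched in Python only for illustration; the real benchmark programs will be SAS-language code run in both systems. The function names and the two stand-in workloads are hypothetical, and reporting the fastest of several runs is just one reasonable convention, not a decision that has been made.

```python
import time


def time_step(label, step, repeats=3):
    """Run step() several times and return the label with the fastest elapsed seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        step()
        timings.append(time.perf_counter() - start)
    return label, min(timings)


def sort_step():
    # Stand-in for a sort benchmark: sort 200,000 integers held in reverse order.
    sorted(range(200_000, 0, -1))


def means_step():
    # Stand-in for a means benchmark: average 200,000 integers.
    data = range(200_000)
    sum(data) / len(data)


results = [time_step("sort", sort_step), time_step("means", means_step)]
for label, seconds in results:
    print(f"{label}: {seconds:.4f} s")
```

Running each step more than once helps separate the cost of the step itself from one-time effects such as disk caching, which matters when the same data sets are used for both packages.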

Links: http://www.teamwpc.co.uk
http://www.sas.com
http://www.minequest.com