Category Archives: Workbench

Edge Analytical Processing

There’s been a lot of work going on in the BI space for the last few years. Some of it interesting, some not so much. But one thing I find very intriguing is what I call Edge Analytics. Think of Edge Computing applied to business and governmental analytical processing.

EA Processing has many facets and I want to cover just a couple of them in this blog post. So exactly what is EA Processing and what advantages does it bring to my organization? EA Processing, in a broad sense, allows a company to process most of its data locally. When demand is high, meaning lots of jobs consuming lots of CPU cycles and I/O bandwidth, the company has the option of firing up a server in the cloud to extend processing capacity. That extra capacity might only be needed for a few hours, or it might be something that is needed at month-end or year-end processing.

Using EA Processing removes the need for a secondary server (or sets of servers) on premises that takes up data center space, electricity and maintenance. EA Processing is also useful for giving consultants and contractors access to compute power in a highly controlled environment, one that can theoretically be more secure than on-premises analytic servers.

When viewed broadly, EA Processing can also serve as a center for disaster recovery. With the hurricanes that hit Houston, Texas and ravaged most of the State of Florida, it has become apparent to many that having EA Processing capability is an important feature to consider when building out your BI stack.

Finally, consider this example of EA Processing that is not truly cloud based but does use remote connectivity to do basically the same thing. Say your company has offices in Munich, Germany as well as Los Angeles, California. Your organization has data scientists and data analysts at both sites. The time difference between Munich and Los Angeles is nine (9) hours for most of the year.

The BI staff in Munich can have access to all the computing power on the analytics servers in Los Angeles, since most of the Los Angeles staff aren’t even up or at work when the Munich staff are at the office. The Los Angeles BI staff (for the most part) also have access to the Munich servers, since the Munich staff are already out the door heading home while the Los Angeles staff are at work.
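To make the off-hours reasoning concrete, here is a minimal Python sketch. It is only an illustration: the fixed standard-time offsets (UTC+1 for Munich, UTC-8 for Los Angeles), the 08:00 to 18:00 workday, and the function names are all assumptions of mine, and a real schedule would need to account for daylight saving time.

```python
from datetime import datetime, timedelta, timezone

# Assumed fixed standard-time offsets; daylight saving time is ignored here.
MUNICH = timezone(timedelta(hours=1), "CET")
LOS_ANGELES = timezone(timedelta(hours=-8), "PST")


def local_hour_elsewhere(hour, here, there):
    """Return the wall-clock hour at `there` when it is `hour`:00 at `here`."""
    moment = datetime(2024, 1, 15, hour, 0, tzinfo=here)
    return moment.astimezone(there).hour


def remote_site_is_idle(hour, here, there, workday=(8, 18)):
    """True when `hour`:00 at `here` falls outside the assumed workday at `there`."""
    start, end = workday
    remote_hour = local_hour_elsewhere(hour, here, there)
    return not (start <= remote_hour < end)


if __name__ == "__main__":
    # Walk through the assumed Munich workday and report the Los Angeles side.
    for hour in range(8, 18):
        la_hour = local_hour_elsewhere(hour, MUNICH, LOS_ANGELES)
        idle = remote_site_is_idle(hour, MUNICH, LOS_ANGELES)
        print(f"Munich {hour:02d}:00 -> Los Angeles {la_hour:02d}:00, idle there: {idle}")
```

Under these assumptions, nearly the whole Munich workday lands in the Los Angeles night, and the two workdays overlap for only about an hour.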

Both locations have access to multiple servers to cover periods of high processing demand at either site. Both locations keep control of their local data, work under an identical security model, and have low latency when running local jobs.

With WPS, you can easily configure remote processing with the workbench. You can use WPS Link to run your jobs on any WPS Server. You can also use WPS Communicate to run jobs or parts of jobs on any WPS Server.

When you have multiple WPS Servers, it’s invaluable to keep them busy. Most companies in my experience have servers sitting idle for a good part of the day. Analysts (and I’m speaking from experience here) get really frustrated and testy having to wait for jobs to execute simply because resources are not properly set up to serve the users in the most expedient and cost-conscious way.

About the author: Phil Rack is President of MineQuest Business Analytics, LLC located in beautiful Tucson, Arizona. Phil has been a SAS language developer for more than 25 years. MineQuest provides WPS consulting and contract programming services and is an authorized reseller of WPS in North America.

Adventures in Porting

We’ve been busy porting the new version of the Bridge 2 R from Windows over to both the Mac and Linux platforms. The Windows release of the Bridge has always allowed the Bridge to be used from within the Eclipse Workbench. WPS didn’t have the Workbench as the GUI on the Mac or Linux until version 3, which is the latest release. So, here’s what I found porting a large program that has to talk to different operating platforms (i.e. make calls to the OS) for such things as deleting, moving and copying files, reading directories, and so on, and still interface with R.

The mundane part of porting was converting a lot of “\” to “/” throughout the code. In retrospect, we could have done a better job writing the Bridge in the first place to accommodate these conventions, but we didn’t have the intention of porting code back then either.
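As an illustration of the separator issue (not the Bridge’s actual code, which isn’t shown here), a short Python sketch shows the two habits that would have saved us the find-and-replace: converting a Windows-style path once at the boundary, and building paths from parts instead of hard-coding a separator. The function names and sample paths are hypothetical.

```python
from pathlib import PurePosixPath, PureWindowsPath


def to_forward_slashes(windows_style):
    """Convert a backslash-separated path string to forward slashes."""
    return PureWindowsPath(windows_style).as_posix()


def join_for_posix(*parts):
    """Build a path from its parts so no separator is hard-coded in the caller."""
    return str(PurePosixPath(*parts))


if __name__ == "__main__":
    print(to_forward_slashes("work\\bridge\\meta.txt"))
    print(join_for_posix("work", "bridge", "meta.txt"))
```

Both calls produce the same forward-slash path, so the rest of the program never has to care which convention the input used.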

Here are a couple of the gotchas that we experienced. When you read a directory on Linux or OS X, the structure is slightly different between the two, and you have to accommodate that difference. The other BIG issue is that the pathnames are much longer on Linux and OS X when reading and writing to the WPS work folders. We ended up resizing our string variables to handle that specific difference.

The above might sound trivial, but one thing we discovered is that when you restart your server on OS X and Linux, the new work folder is contained inside the previous folder. For example, your original folder, let’s call it work1, is now hosting work2, your new folder. Now the path name is /work1/work2. But in reality, the names of the work folders are not work1 or work2 but long strings that can be hundreds of characters long. If you have a user who likes to restart their WPS Server, you can eat up a lot of string space quickly.
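To get a feel for how quickly the nesting eats string space, here is a rough Python sketch. The 200-character folder name and the 1024-character variable size are made-up numbers for illustration, not values taken from WPS or the Bridge, and the function names are hypothetical.

```python
def nested_work_path(root, folder_names):
    """Nest each new work folder inside the previous one, as described above."""
    path = root
    for name in folder_names:
        path = f"{path.rstrip('/')}/{name}"
    return path


def restarts_until_overflow(root, name_length, buffer_size):
    """Count how many nested work folders fit before the path exceeds buffer_size."""
    length = len(root.rstrip("/"))
    count = 0
    # Each restart adds one separator plus one folder name to the path.
    while length + 1 + name_length <= buffer_size:
        length += 1 + name_length
        count += 1
    return count


if __name__ == "__main__":
    print(nested_work_path("/", ["work1", "work2"]))
    print(restarts_until_overflow("/tmp", name_length=200, buffer_size=1024))
```

With those assumed sizes, a string variable of 1024 characters holds only five nested work folders, so a handful of restarts is enough to overflow it.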

Since we store a lot of metadata for the Bridge 2 R inside the work folder, R has to be able to cope with very long filenames and I’m not convinced that it really copes all that well. Speaking of file names, here’s another anomaly between Windows and Linux/OS X systems. If you have a filename such as “myfile.txt ” (note the blank space at the end of .txt), Windows handles that just fine. Windows will interpret that as meaning you wanted “myfile.txt”. However, if you write such a file or try to read a file with that name under Linux or OS X, then those two names are distinctly different. On Ubuntu or Fedora, that name shows up as “myfile.txt\” when you list the files from the terminal.
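One defensive habit is to check names for trailing blanks before they ever reach the file system. The Python sketch below is only an illustration of that idea under my own assumptions (the function names are hypothetical and this is not how the Bridge handles it): it flags names that end in a space and groups names that differ only by trailing spaces.

```python
from collections import defaultdict


def has_trailing_space(filename):
    """True when the name ends in a blank, which Linux and OS X treat as significant."""
    return filename.endswith(" ")


def trailing_space_collisions(filenames):
    """Group names that differ only by trailing spaces.

    These are the names described above as equivalent on Windows but
    distinct on Linux and OS X.
    """
    groups = defaultdict(list)
    for name in filenames:
        groups[name.rstrip(" ")].append(name)
    return {key: names for key, names in groups.items() if len(names) > 1}


if __name__ == "__main__":
    names = ["myfile.txt", "myfile.txt ", "other.csv"]
    print([name for name in names if has_trailing_space(name)])
    print(trailing_space_collisions(names))
```

Stripping the trailing blanks once, at the point where a name is built, is the simpler fix when you control the code that generates the names.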

It took us about three days to port the Linux version of the Bridge over from Windows. Much of that time was spent dealing with the issues in the previous paragraph. We then took the ported Linux code and tested it on OS X. It took about 20 minutes to modify the section dealing with the difference in reading directories between the two platforms, and we then had a new version of the Bridge to R running on OS X.

In retrospect, porting the code over to Unix/Linux systems was worth the effort. It took us a few days to do the porting, and much of that was due to being naive about the new target platforms. In my next post, I will talk about the new enhancements in the Bridge to R (and a programming change users will have to make).

About the author: Phil Rack is President of MineQuest, LLC. and has been a SAS language developer for more than 25 years. MineQuest provides WPS and SAS consulting and contract programming services and is a reseller of WPS in North America.

WPS Tips – Start a New Session

As most of you who have been using WPS via the Eclipse Workbench already know, you can submit a WPS program and instantly start modifying code or viewing the log window or listing window without impact to your existing session. You can even start a new program editor while your WPS program is running in the background. Try that with the SAS DMS!

One thing you can’t do is submit a second WPS program while you are running a program in an existing session. But here’s a neat trick that allows you to run a second program simultaneously and take advantage of all that horsepower of your PC. By simply starting a new WPS session you can run another WPS program while your first one in session #1 is executing.

To do this, simply go to the main menu, click on Window and then select New Window. A new WPS session will start up with all your existing programs in the Project Explorer window.

WPS Tip – Turn ON or OFF Line Numbers

Here’s a quick tip for you. Did you know that you can turn on or off line numbers in the editor via a pop-up menu?

From the editor window, move your mouse to the left margin area and then click your right mouse button. A pop-up menu will appear that lets you turn line numbers ON or OFF. There’s also a Preferences selection that you can click to set other parameters such as color coding and the number of spaces for a tab.
