Category Archives: Linux

Ubuntu 16.04 Released and Quick Test Drive

In the last week, Canonical has brought forth a new release of Ubuntu and it is pretty nice! Version 16.04 has a number of great features that should be of value to those who use Linux. One thing that Ubuntu has at this point is a vertical line of products. I can’t think of any other vendor who has an OS that runs on phones, tablets, notebooks/workstations, servers and mainframes.

I decided to give it a try on one of my workstations, running it in an Oracle virtual machine (VirtualBox, to be specific), to see how WPS runs on this new release. Just to cut to the chase, it runs quite well. As a matter of fact, once I got the VM to use all of its allotted storage, WPS ran like a charm.

A couple of things might be of interest to potential Ubuntu upgraders. First, Ubuntu 16.04 supports ZFS, which might be important to a few sites. The second is the support for LXD 2.0. From the Ubuntu website:

LXD 2.0

Ubuntu 16.04 LTS includes LXD, a new, lightweight, network-aware, container manager offering a VM-like experience built on top of Linux containers.

LXD comes pre-installed with all Ubuntu 16.04 server installations, including cloud images and can easily be installed on the Desktop version too. It can be used standalone through its simple command line client, through Juju to deploy your charms inside containers or with OpenStack for large scale deployments.

All the LXC components – LXC, LXCFS and LXD – are at version 2.0 in Ubuntu 16.04 LTS.
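
As a rough illustration of the "simple command line client" mentioned in the quote above, here is a minimal Python sketch that builds a few basic `lxc` invocations without running them. The container name `wps-test` is a made-up example, and the selection of subcommands is my own assumption about a typical first session, not something taken from the Ubuntu announcement.

```python
import shlex


def lxc_commands(container, image="ubuntu:16.04"):
    """Build (but do not run) a few basic lxc client invocations.

    `container` is a hypothetical name chosen for illustration.
    """
    return [
        ["lxc", "launch", image, container],       # create and start a container
        ["lxc", "list"],                            # list containers and their state
        ["lxc", "exec", container, "--", "bash"],  # open a shell inside the container
        ["lxc", "stop", container],                 # stop it again
    ]


if __name__ == "__main__":
    # Print the commands so they can be reviewed or pasted into a terminal.
    for cmd in lxc_commands("wps-test"):
        print(" ".join(shlex.quote(part) for part in cmd))
```

Printing the commands rather than executing them keeps the sketch safe to run on a machine that does not have LXD installed.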

In addition to trying Ubuntu 16.04 in a VM, I have also tested it on a small server (6 LCPU with 32GB of RAM) running WPS. Although I have not benchmark tested this exhaustively, it does appear that using v16.04 with WPS 3.3.2 (which is the latest release) provides a modest performance increase. This is easily observed with multi-threaded Procedures such as Means and Summary.

About the author: Phil Rack is President of MineQuest Business Analytics, LLC located in beautiful Tucson Arizona. Phil has been a SAS language developer for more than 25 years. MineQuest provides WPS and SAS consulting and contract programming services and is an authorized reseller of WPS in North America.

Richmond, CA Hackathon – Meeting of the Minds

On the weekend of October 17-18, the Meeting of the Minds Civic Hackathon will take place in Richmond, California. Among the various tools and facilities that will be available for the Hackathon, World Programming will be providing WPS software (www.teamwpc.co.uk/products/wps) and support for any SAS programmers taking part in the event who would like to create and run programs in the language of SAS. The WPS software will be available on a server provided by Cisco and also for installation onto your own workstations running Linux, OS X or Windows. Teams who use WPS software at this event will be given a license at no cost, and can use the product and all of its features for an additional 6 months after the event.

There will be data sets that can be used to create civic-oriented applications. The data is categorized into Economic Development, Public Spaces, Health and Environment, Sustainability, Digital Divide and Education, so there is plenty available for a myriad of subject matter experts to use.

There is a $5,000 cash prize from Qualcomm awarded to the winner.

More information on the Hackathon can be found at: http://tinyurl.com/p6ymuot

About the author: Phil Rack is President of MineQuest Business Analytics, LLC located in Grand Rapids, Michigan. Phil has been a SAS language developer for more than 25 years. MineQuest provides WPS and SAS consulting and contract programming services and is an authorized reseller of WPS in North America.

Macro Catalog Compatibility

Here’s something rather interesting that I discovered earlier today. If you create and compile a macro catalog on, say, Windows, you can simply copy that catalog onto Linux or Mac OS X. The compiled catalog is then accessible on all the x86 platforms that WPS supports.

Think about how important that can be. If you are a developer and want to be sure that your catalogs are portable across x86 platforms, then you are in good shape with WPS. Think of the cost savings. With WPS, you could create, compile and distribute on x86 systems. In contrast, our competitor would require you to purchase both a Linux and a Windows version of their software to do the same.

‘nuff said!

Some thoughts on a rainy Monday

The more I use Linux, the more I come around to understand just how much I can do with it. As a matter of fact, I could easily do without Windows and switch 100% of the way over to Linux if I wanted. The desktop(s) and business applications have really gotten that good.

Windows 8 just soured me on the whole MS ecosystem. When they bolted the Metro interface onto a server OS, that was the last straw for me. Whoever made the decision to strap a touch interface onto a server should be let go. Shown the door. Asked to leave…

I have Apple hardware here in the office, and it runs well, but I just have not been able to embrace it like so many others have. Apple makes some fine hardware and there’s a load of support for office productivity applications as well as analytical apps. WPS runs quite well on OS X, as does R. As a matter of fact, I see a lot of R users who work on OS X as their preferred platform.

But Linux, and specifically Ubuntu 12.04 and 14.04, has been especially good. I don’t have memory issues when I run large simulations in R that require a lot of RAM. With Windows, that is often a problem: you try to allocate a large block of memory and there is not sufficient contiguous memory to hold a large array, vector or data frame. The memory management is significantly different under Linux than under Windows.

Nvidia’s CUDA framework seems to be used predominantly on Linux rather than Windows. I’m not sure why that is, to be honest.

I’ve been reading a lot of articles stating that MS is working feverishly trying to get Windows 9 out the door. No doubt (at least in my mind) it has to do with the terrible Metro interface and people staying away in droves. Of course, you can slap Start8 by Stardock on Windows 8 and it makes it usable by restoring the Start button, and kudos to Stardock for doing such a thing, but I still can’t find a way to embrace MS on the desktop any longer.

An interesting phenomenon that I have been witnessing is how much analytical and scientific development has been happening over the years on the Linux platforms. There are a lot of tools out there for Linux that are helpful if you are a data scientist or working with “BIG DATA”. My experience in reselling WPS is that there is at least as much interest in using Linux on servers as in running Windows servers. Cost is one factor but performance is also a factor. Linux often outperforms Windows Server dollar for dollar and CPU second for CPU second.

A Summer Project

One of my summer projects is building and performance tuning a relatively inexpensive analytics server. Many of the parts that are being used have been scavenged from another server or two that have been retired. One thing I want to do this summer is report on what I have discovered in performance tuning a modest server.

The server consists of a six core AMD processor and 16GB of RAM to start out with. I would like to experiment with different combinations of RAM, hard drives, hard disk controller cards and perhaps an SSD or two. The OS will be Linux, Ubuntu 14.04 specifically.

My baseline build has just two work drives in RAID-0 and uses the SATA 3 ports on the motherboard. I will use the Workstation Performance Assessment Program that I wrote about back in 2012. I’ve slightly modified that program so that it doesn’t spew output in the listing with the exception of the actual performance benchmark.

One thing I have already learned is that you need to make sure that you have the Write Cache enabled. In Ubuntu, you would do this by going to Disks and clicking on the options button at the top right of the dialog box and then selecting Drive Settings. Simply select the Write Cache and click on Enable Write Cache. You will need to do this for each disk in the RAID array.
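
For a server without a desktop session, the same per-disk setting can be scripted. The sketch below is only an illustration and rests on assumptions that are not part of my setup notes: it uses the `hdparm` utility (its `-W1` / `-W0` options turn a drive’s write cache on or off), and the device names `/dev/sdb` and `/dev/sdc` are hypothetical placeholders for the two work drives. By default it only prints the commands.

```python
import shlex
import subprocess


def write_cache_commands(devices, enable=True):
    """Build one hdparm command per disk (assumption: hdparm is installed)."""
    flag = "-W1" if enable else "-W0"  # -W1 enables, -W0 disables write caching
    return [["hdparm", flag, dev] for dev in devices]


def apply(commands, dry_run=True):
    """Print the commands, or run them through sudo when dry_run is False."""
    for cmd in commands:
        if dry_run:
            print(" ".join(shlex.quote(part) for part in cmd))
        else:
            subprocess.run(["sudo"] + cmd, check=True)


if __name__ == "__main__":
    # Hypothetical device names; substitute the members of your own array.
    work_drives = ["/dev/sdb", "/dev/sdc"]
    apply(write_cache_commands(work_drives), dry_run=True)
```

Keeping the dry run as the default means nothing is changed until the device list has been checked by hand.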

When I enabled the write cache, the timings for the PROCs and DATA steps that ran against data sets on the work array dropped 35%. That’s a big improvement!

Thursday Ramblings

Does anyone do comparisons of graphics cards and measure performance in a VM? Specifically, do certain graphics cards boost performance when running VMs on the desktop? I like to see my windows “snap” open when I switch from VM to VM. As a developer, I often wonder if spending an additional $150 on a popular graphics card will yield a perceptible performance boost.

Speaking of graphics cards, we recently bought a couple of used Nvidia Quadro graphics cards from a local CAD/CAM company that is upgrading their workstations. I got these at about 5% of their original retail price so I’m happy. We were having problems getting a couple of servers to go into sleep mode using Lights Out and we discovered that we needed a different graphics card to accomplish this. The plus side is that these are Nvidia cards with 240 CUDA cores and 4GB of RAM. So we now have the opportunity to try our hand at CUDA development if we want. I’m mostly interested in using CUDA for R.

One drawback to using CUDA, as I understand it, is that it is a single-user interface. Say you have a CUDA GPU in a server: only one job at a time can access the CUDA cores. If you have 240 CUDA cores on your GPU and would like to allocate 80 of them to an application, thinking you can run three of your apps at a time, that is not possible. What it seems you have to do instead is install three graphics cards in the box so that each user or job has access to a single card.

There’s a new Remote Desktop application coming out from MS that will run on your Android device(s) as well as a new release from the Apple Store. I use the RDC from my Mac mini and it works great. I’m not sure what they could throw in the app to make it more compelling, however.

Tom’s Hardware has a fascinating article on SSDs and performance in a RAID setup. On our workstations and servers, we have SSDs acting as a cache for the work and perm folders on our drive arrays. According to the article, RAID0 performance tends to top out with three SSDs for writes and around four for reads.

FancyCache from Romex Software has become PrimoCache. It has at least one new feature that I would like to test, and that is L2 caching using an SSD. PrimoCache is in beta, so if you have the memory and hardware, it might be advantageous to give it a spin to see how it could improve your BI stack. We did a performance review of FancyCache in a series of posts on Analytic Workstations.

FYI, PrimoCache is not the only caching software available that can be used in a WPS environment. SuperSpeed has a product called SuperCache Express 5 for Desktop Systems. I’m unsure if SuperCache can utilize an SSD as a Level 2 cache. It is decently priced at $80 for a desktop version but $450 for a standard Windows Server version. I have to admit, $450 for a utility would give me cause for pause. For that kind of money, the results would have to be pretty spectacular. SuperSpeed offers a free evaluation as well.

If you are running a Linux box and want to enjoy the benefits of SSD caching, there’s a great blog article on how to do this for Ubuntu from Kyle Manna. I’m very intrigued by this and if I find some extra time, may give it the old Solid State Spin. There’s also this announcement about the Linux 3.10 Kernel and BCache that may make life a whole lot easier.

I’ve Grown Weary of Windows

This blog post is going to be a rant.

I’m so frustrated with Windows Server 2012 R2 that I can spit nails. Who in their right mind at Microsoft thought changing the interface on a SERVER to what is used in Windows 8 was a good idea? If you want to do any real administrative work on the server it is just a nightmare.

I’ve played with Windows 8 on a desktop and didn’t care for it and decided that Windows 7 was so much better for productivity. The mixture of a tablet OS and a desktop OS is just a disaster. In my opinion, MS not only missed the boat, but continues to ignore the marketplace as it centers on business users.

Going forward, I’m going to start recommending that clients use Linux on their servers and just forget about using Windows Server products. It just isn’t worth the hassle and with the number of talented Linux users and administrators growing every day, there isn’t any upside anymore to using Windows on the Server. There are incredible cost savings in both dollars and time using Linux instead of Windows.

My own view on the server is that Linux is faster than Windows. You don’t have all that eye candy eating up resources. Linux is faster, more robust and has virtually the same number of databases available that you have under Windows. The exception is SQL Server. If you have to run SQL Server, then put it on the smallest box possible and minimize your exposure to Windows. There are many databases that you can use on Linux that will fill the void of SQL Server. For example, DB2, Oracle, MySQL, MariaDB, PostgreSQL, Teradata, Vertica, Sybase, SAND, Netezza, Kognitio, Informix, and Greenplum all run on Linux x86. And the kicker is that all of the above DBs are supported and accessible from WPS.

I’m also starting to review and reconsider my position on Windows on the desktop. If Windows 9 is the abortion that Windows 8 (and 8.1) continues to be, then you can bet that I will start using Linux on my desktop or (God forbid that I’m saying this…) OS X. I talk to a lot of analytics users and this is something that we all agree on. I need to be productive at work and I’m more productive with Linux and OS X than with Windows 8.

That’s the bottom line.

Creating a WPS Launch Icon in Ubuntu

I use Ubuntu for my WPS Linux OS and it’s pretty easy to install. However, unlike the vast majority of people out there who run it in batch mode, I like to run it in interactive mode using the Eclipse Workbench. Hence I want an icon that I can click on to start WPS. Here’s how to do it.

On the Ubuntu desktop, right-click on an empty part of the screen and you will get a little option menu. Click on “Create Launcher…” and a dialog box will pop up.

On my Ubuntu Linux Server, I installed WPS into a folder named wps-3.0.1. The directions below use that folder name as our example. You may have installed WPS into another folder so be sure to consider that when performing the tasks below.

Name: WPS 3.01

Command: /home/minequest/wps-3.0.1/eclipse/workbench

Comment: WPS 3.0.1 Linux

Click on the icon in the upper left-hand corner of the Create Launcher dialog box (the little spring) and you will get a choose-icon list box. Simply go to the WPS install folder and go into the eclipse folder. There you will find a file named icon.xpm. Click on icon.xpm, then click Open, and then click OK.

That’s all there is to it. You should have the WPS icon installed and available from your desktop.
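
A launcher created this way is stored as a small text file in the freedesktop.org “desktop entry” format, so the same result can be scripted. The sketch below is an illustration under assumptions, not a record of what the dialog writes: the key names (`Type`, `Name`, `Exec`, `Comment`, `Icon`, `Terminal`) come from the desktop entry specification, the values are the ones used in the steps above, and the helper names and the file name `wps.desktop` are made up for this example.

```python
from pathlib import Path


def desktop_entry(name, command, comment, icon):
    """Return the text of a minimal desktop entry for an application launcher."""
    lines = [
        "[Desktop Entry]",
        "Type=Application",
        f"Name={name}",
        f"Exec={command}",
        f"Comment={comment}",
        f"Icon={icon}",
        "Terminal=false",  # the Workbench is a GUI, so no terminal window is needed
    ]
    return "\n".join(lines) + "\n"


def write_launcher(directory, filename, text):
    """Write the entry into `directory` and mark it executable."""
    path = Path(directory) / filename
    path.write_text(text, encoding="utf-8")
    path.chmod(0o755)
    return path


if __name__ == "__main__":
    wps_home = "/home/minequest/wps-3.0.1"  # install folder used in this post
    entry = desktop_entry(
        name="WPS 3.01",
        command=f"{wps_home}/eclipse/workbench",
        comment="WPS 3.0.1 Linux",
        icon=f"{wps_home}/eclipse/icon.xpm",
    )
    # Only print here; call write_launcher(Path.home() / "Desktop", "wps.desktop", entry)
    # to actually create the file (hypothetical file name).
    print(entry)
```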

Analytical Data Marts

Recently, there has been a conversation on what defines “Big Data”. It’s my position (among others) that Big Data is data that is so large that a single computer cannot process it in a timely manner. Hence, we have grid computing. Grid computing is not inexpensive and is overkill for many organizations.

The term “Huge Data” has been bandied about as well. In the conversations regarding what is Big Data, it was sort of agreed that Huge Data is a data set that sits somewhere between 10GB and 20GB in size. (Note: In about two years I will look back at this article and laugh about writing that a 20GB data set is huge for desktops and small servers.) The term Big Data is so abused and misused by the technical press and even many of the BI vendors that it’s almost an irrelevant term. But Huge Data has my interest and I will tell you why.

The other day I read a blog article on the failure of Big Data projects. The article talks about a failure rate of 55%. I was not surprised by that kind of failure rate. I was surprised that there were not solutions being offered. In the analytics world, especially in finance and health care, we tend to work with data that comes from a data warehouse or a specialized data mart. The specialized data mart is really an analytics data mart with the data cleaned and transformed into a form that is useful for analysis.

Analytical data marts are cost effective. This is especially true when the server that is required is modest compared to the monster DBs running on large iron. Departments can almost always afford a smaller server, and can expect and receive much better turnaround time on jobs than they get from most data warehouses. Data marts are more easily expandable and can be tuned more effectively for analytics. Heck, I’ve yet to work on a mainframe or large data warehouse that could outrun a smaller server or desktop for most of my needs.

The cost for a WPS server license on a four, eight or even sixteen core analytics data mart is quite reasonable. With WPS on the desktop and a WPS Linux server, analysts can remotely submit code to the data mart and receive the log, listings and graphics right back in their desktop workbench. But the biggest beauty of running WPS in your data mart platform is that WPS comes with all the database access engines as part of the package. If you have worked in a large environment with multiple database vendors, you can see how this can be very cost effective when it comes to importing data from all these different databases into an analytical data mart.

test, Test and TEST

I’m always amazed and somewhat peeved about how much error checking one has to have in their SAS language program. Even for the simplest things. So today’s mantra is test, Test, and TEST!

I was writing some code the other day to copy a WPS data set from my PC to a server share. The basic code is really quite simple, only three lines using PROC COPY. But what if the user has misidentified the source libname or the destination libname? Do you just let it blow up and hope the user looks at the logs? And then you have the data sets to be copied if you are using the SELECT statement. Do you check if the data set already exists and, if so, just overwrite the file?

Although writing the code to check for these conditions was trivial, albeit time consuming, it is well worth it. I purposely decided not to automatically overwrite an existing data set on the server. And that is good for two reasons. First, I want the user to be forced to make the decision to overwrite the data set by use of a PROC DATASETS or PROC DELETE before the copy takes place. That makes it their responsibility to delete the data set.

Secondly, I found out that writing the data set with the same name can sometimes create problems under Windows when the server folder is shared. I have had some experiences where Windows locks the file on the server and the copy never takes place. The copy procedure just hangs with a .lck extension on the file. So something is going on where it’s just not reliable.

One interesting thing to note: I don’t seem to have the problem with a lock on Linux. The copy takes place without issue every single time.
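
The order of the checks described in this post can be sketched in a few lines. This is a Python analogue written only to illustrate the logic; it is not the SAS language code I wrote, and the function name, exception name and the file name `sales.wpd` are hypothetical. Folders stand in for the source and destination libnames, and an existing target is treated as an error rather than being overwritten.

```python
import shutil
import tempfile
from pathlib import Path


class CopyCheckError(Exception):
    """Raised when a pre-copy check fails (hypothetical helper type)."""


def copy_dataset(source_dir, dest_dir, filename):
    """Copy one data set file, refusing to overwrite an existing target."""
    src_dir, dst_dir = Path(source_dir), Path(dest_dir)
    if not src_dir.is_dir():          # source "libname" misidentified?
        raise CopyCheckError(f"source library not found: {src_dir}")
    if not dst_dir.is_dir():          # destination "libname" misidentified?
        raise CopyCheckError(f"destination library not found: {dst_dir}")
    src = src_dir / filename
    if not src.is_file():             # data set named in the SELECT missing?
        raise CopyCheckError(f"data set not found: {src}")
    dst = dst_dir / filename
    if dst.exists():                  # never overwrite; the user must delete first
        raise CopyCheckError(f"{dst} already exists; delete it before copying")
    shutil.copy2(src, dst)
    return dst


if __name__ == "__main__":
    # Self-contained demo using temporary folders in place of real libraries.
    with tempfile.TemporaryDirectory() as tmp:
        pc, server = Path(tmp) / "pc", Path(tmp) / "server"
        pc.mkdir()
        server.mkdir()
        (pc / "sales.wpd").write_bytes(b"demo")
        print("copied to", copy_dataset(pc, server, "sales.wpd"))
        try:
            copy_dataset(pc, server, "sales.wpd")
        except CopyCheckError as err:
            print("refused:", err)
```

Failing loudly at the first bad condition is the point: the user gets a specific message instead of having to dig through the log after the copy blows up.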
