Over the last few days, I’ve been testing WPS with some rather large data sets to determine what kind of limits there are and to check the stability of the system under heavy load. So far, I’ve limited my tests to simple PROCs like Append, Copy, Freq, and Means on WPS data sets. In these tests, I keep increasing the size of the data sets as each batch successfully completes. I’ve now run tests on data sets containing 1 million, 10 million, 100 million, 1 billion and 2 billion observations.
I believe that the upper limit of a WPS data set on Windows is 2^31 - 1, or 2,147,483,647 observations. I’ve tried to append data to go beyond that number, and WPS graciously tells me that there is an error, “Too many records in data set xxxxxx”, and refuses to load any more observations into the data set. An interesting side note: in discussing the upper bound with a colleague, he reminded me that SAS had this same limitation at one time as well.
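To make that number a little more concrete, here is a minimal Python sketch of the arithmetic. It assumes the ceiling comes from a signed 32-bit observation counter, which is my inference from the value and not something I have confirmed. The names `INT32_MAX` and `can_append` are made up for illustration and are not part of WPS.

```python
# Assumption: the observation count is held in a signed 32-bit integer,
# which would explain a ceiling of exactly 2^31 - 1.
INT32_MAX = 2**31 - 1


def can_append(current_obs: int, new_obs: int, limit: int = INT32_MAX) -> bool:
    """Return True if appending new_obs observations stays within the limit."""
    return current_obs + new_obs <= limit


print(f"{INT32_MAX:,}")                        # 2,147,483,647
print(can_append(2_000_000_000, 100_000_000))  # True: 2.1 billion still fits
print(can_append(2_000_000_000, 200_000_000))  # False: 2.2 billion does not
```

In other words, a 2 billion observation data set has room for a little under 150 million more observations before the append is refused.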
I’ve never had an opportunity, at any site where I’ve worked, to process such large amounts of data. I have consulted for a client that had 800 million credit card accounts and 32 billion transactions in their data warehouse (different vendors, private label cards, etc.), but analyzing a single portfolio never involved more than 160 million records.
Running these tests can be time-consuming. Getting my head around the size of these files is something that does not come naturally to me. For example, going from a data set of 1 million records to one of 2 billion records means the data set is 2,000 times larger. I don’t have the fastest Windows Server in town (I do have a lot of storage though), so I basically let it chug while I work on other projects. But these tests do illustrate that WPS is capable of handling large files.
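For anyone else who struggles to picture the scale, a quick back-of-the-envelope calculation helps. The sketch below is in Python and assumes a hypothetical record length of 100 bytes per observation. That figure is purely illustrative (it is not the record length of my test data sets), and it ignores any page or header overhead in the file.

```python
# Hypothetical record length in bytes; the real figure depends on the
# variables in the data set. Page and header overhead are ignored.
BYTES_PER_OBS = 100

# The observation counts used in the test runs.
TIERS = [1_000_000, 10_000_000, 100_000_000, 1_000_000_000, 2_000_000_000]


def approx_size_gb(n_obs: int, bytes_per_obs: int = BYTES_PER_OBS) -> float:
    """Rough raw data size in gigabytes (1 GB = 10^9 bytes)."""
    return n_obs * bytes_per_obs / 1e9


for n in TIERS:
    scale = n // TIERS[0]  # size relative to the 1 million observation test
    print(f"{n:>13,} obs  x{scale:>5,}  ~{approx_size_gb(n):,.1f} GB")
```

Under that assumption, the 1 million observation test is about a tenth of a gigabyte of raw data, while the 2 billion observation test is about 200 GB.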
The real reason I’ve been performing these tests is to determine whether WPS can replace SAS for the heavy lifting for which it is so often used. At this point, I believe the answer is yes. For companies that are interested in using WPS as a backend for the development and implementation of vertical market applications, database access and web enablement, without having to contend with the sky-high licensing fees requested by SAS Institute, this is something to have on your short list for evaluation. For organizations that need to do daily heavy lifting of data (sorting, summarization and reporting), WPS can fulfill most of your needs in replacing SAS/Base, SAS/Access and SAS/IntrNet.