SAS – Access to Software

If you’re learning to use SAS, whether at school or at home, having your own access to the software is a big help. SAS offers three different ways to access its Base software for free. Each one includes SAS Base, SAS/STAT and SAS/IML. Some of the Time Series procedures are not available. I’m not sure why the Time Series procedures are in their own package and not part of SAS/STAT.

The three different methods to access SAS for free are:

SAS University Edition – via download
SAS University Edition – via AWS
SAS Academics on Demand

Regardless of which method you choose, you will be interacting with SAS via SAS Studio, which is a web-based interface to SAS. The browser sends commands to a server, which sends the results back to your computer. In the downloaded version, the server runs on a virtual machine on your own computer. With AWS and AoD, the server is on Amazon’s or SAS’s servers respectively. AWS can incur a charge, and will do so once a year has passed since your initial setup. This is an Amazon restriction: usage of their micro tier is free for 1 year only. The charge estimate for Canada was approximately $8.64 a month if the server was kept running 24/7. If you remember to suspend the instance when you are not using it, the charge will be less. Other charges can be incurred if you transfer large data sets to AWS. AoD and SAS UE on the desktop have no charge.

SAS UE running on your desktop ensures your data stays local. However, if you work on multiple computers (i.e. work, home and school) and want access to your programs and data anywhere, a cloud solution is your best option.

You have more control over SAS on your own computer, as you can set the RAM and, if required, make sure your computer is not working on anything else. Your files are always available as text files on your desktop.

SAS provides some basic support for all of these versions via support.sas.com and communities.sas.com.

Disclosure: This post was not paid for, nor did I receive any benefits from SAS for posting it.

SAS and Literals

There are three types of literals in SAS that are frequently used. Each one is a quoted value followed by a letter or two.

Date Literal (d)

When hard coding a date into your code or creating a date, you need to write it in the DATE9. format, in quotes, and add a d at the end. For example:

date_var = '31Jan2016'd;

You aren’t required to use a 4-digit year, but I highly recommend it. Another way to reference a date without using a literal is to use the MDY function:

x = MDY(1, 31, 2016);

With the date specified in either form, you can use it in your code, either to assign it to a variable or to use it in a comparison. The following two lines of code do the same thing:

if x='31Jan2016'd then do;

if x=mdy(1, 31, 2016) then do;
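A minimal sketch of that equivalence, assuming nothing beyond Base SAS. The variable names are made up for illustration. Both values are stored as a count of days since 1 January 1960, which is why the comparison succeeds.

```sas
data _null_;
    /* The same date written two ways */
    from_literal  = '31Jan2016'd;
    from_function = mdy(1, 31, 2016);

    /* Unformatted, both show the underlying day count */
    put from_literal= from_function=;

    /* Formatted, the value is readable again */
    put from_literal= date9.;

    if from_literal = from_function then put 'The two values match.';
run;
```

Running this writes the values to the log, so it is an easy way to check a literal before using it in a larger program.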

Datetime Literal

Along the same lines, you can also specify a datetime value in SAS, except this time you use the letters dt.

datetime_literal= '31Jan2016:04:20:00'dt;

Again, you can also use a function, in this case the DHMS() function. The first parameter is the date, followed by the hour, minute and seconds.

datetime_function = dhms('31Jan2016'd, 4, 20, 0);
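Here is a small sketch that puts the two forms side by side and then splits the datetime back into its parts. The variable names are illustrative only. A datetime is stored as a count of seconds since midnight on 1 January 1960, so it needs a datetime format to be readable.

```sas
data _null_;
    dt_literal  = '31Jan2016:04:20:00'dt;
    dt_function = dhms('31Jan2016'd, 4, 20, 0);

    /* Both should print the same formatted value */
    put dt_literal= datetime20.;
    put dt_function= datetime20.;

    /* DATEPART and TIMEPART pull the pieces back out */
    date_only = datepart(dt_literal);
    time_only = timepart(dt_literal);
    put date_only= date9. time_only= time8.;
run;
```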

Name Literal

The last one is rarer to see, because SAS usually doesn’t allow it, although Enterprise Guide (EG) does allow it by default. SAS variable names have restrictions: for example, they cannot contain spaces and cannot start with a number. If you want to ignore those restrictions you can set the validvarname option. The default value is V7.

 

option validvarname=any;

With this option you can create a variable that has a name such as ‘My Variable Name’.

To reference it, you then use the letter n to distinguish it from a character string.

'My Variable Name'n
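Putting those pieces together, a minimal sketch might look like the following. The data set name work.demo and the value 42 are made up for illustration.

```sas
options validvarname=any;

data work.demo;
    /* The n after the closing quote marks this as a name, not a string */
    'My Variable Name'n = 42;
run;

proc print data=work.demo;
    var 'My Variable Name'n;
run;

/* Put the option back to the usual default afterwards */
options validvarname=v7;
```

Resetting the option at the end is a matter of taste. It just keeps later code in the same session from creating odd names by accident.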

 

Trust in Data Science

Trust – a person on whom or thing on which one relies (Dictionary.com)

Data Science is a difficult field.

There isn’t always certainty in the answers, methods or recommendations. What works today may not work tomorrow or in a new field. A new algorithm speeds things up, new information changes the best method, best practices change. Staying on top of all of that is most definitely a challenge. Being a CEO, CIO, or even a Chief Data Scientist is a daunting task. Which is why the people who do the actual day-to-day work need to be trustworthy. As the CEO or a colleague, I need to trust that you know what you’re doing and that what you recommend is something I can go along with. I may not always be in a place to verify every detail of your work, or sometimes even understand it.

So if you’re a douchebag online, insult people online, or make sexist remarks at the office water cooler or on the corporate jet, you can’t really be trusted, in my opinion. In general, when you can’t trust someone, you don’t trust the work they do.

So don’t be a douchebag. Online or in real life. Is that really so difficult?

Fun with functions

I’ve been programming with SAS for over a decade now, and am a very active user on the community. Partly because there’s so much to know. Today alone I learned two new functions that I didn’t know or had forgotten :).

  1. NLITERAL() function
    • SAS allows you to name variables freely with the validvarname=any option. However, this means you need to refer to the variable with quotes around the name and an n at the end to differentiate it from a character string, i.e. 'Variable Name'n. Building that structure from a macro variable can be cumbersome, as you need quoting, concatenation and a trim function to ensure that spaces are removed. Then you need to make sure the quotes don’t interfere with your code and resolve properly. The NLITERAL function takes care of all of this by properly defining it as a literal, and it’s smart enough to only do it when required!
  2. CHAR() function
    • The CHAR function returns the single character at a given position in a string. When you need to loop over a string this is incredibly useful.
  3. UPPER() vs %UPCASE() function
    • I’m familiar with UPPER(), but what I didn’t know was that applying it to a column when querying a dictionary table in SAS slows things down. Instead, you should apply the %UPCASE() macro function to the value you are comparing against.
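A short sketch of all three in use. The variable names and the macro variable lib are hypothetical, and the dictionary query assumes the library name is stored in upper case in DICTIONARY.TABLES, which is why only the comparison value is upper-cased.

```sas
data _null_;
    length name1 name2 $40;

    /* NLITERAL only wraps the name when it is not already a valid name */
    name1 = nliteral('My Variable Name');
    name2 = nliteral('Height');
    put name1= name2=;

    /* CHAR pulls out one character at a time */
    word = 'SAS';
    do i = 1 to length(word);
        letter = char(word, i);
        put i= letter=;
    end;
run;

/* Hypothetical macro variable holding a library name in lower case */
%let lib = sashelp;

proc sql;
    /* Upper-case the value with %UPCASE rather than
       wrapping the LIBNAME column in UPPER() */
    select memname
        from dictionary.tables
        where libname = "%upcase(&lib)";
quit;
```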

Another day, another piece of SASsy knowledge.

Sample Work

SAS Presentations


Reporting from BASE SAS – Tips & Tricks (2011)

  • Indexes, Proc Means (Ways/Types), ODS Tagset Intro

Creating a TOC Using CSS (2011)

  • Creating a Table of Contents in a Word document by styling text in standard Word formats and then using the auto format feature in Word.

Coordinating Complex Reports in SAS (2014)

Avoid Macros (2015)

  • FAQ for macros
    • Importing multiple files
    • Export multiple files
    • Splitting a SAS dataset
    • Calling Macro multiple times
  • Sample Code

Reports with Style from Excel

  • Examples of using ODS TAGSETS.EXCELXP features

Windows 7 and SAS 9.3

  • Highlight some new options available under Windows 7 and in SAS 9.3

Sample SAS code library on GitHub


Tableau

How do I write a macro to…split my data set into multiple files?

A very common question on forums and help boards appears to be “How do I write a macro to do XYZ?”. Oftentimes a macro isn’t required. This is part one of an ongoing series on how to accomplish a task WITHOUT using a macro.

The first example answers the question of how to split a data set into multiple files. Any file generated using a FILE statement can be generated using this method. I will be using the FILEVAR option of the FILE statement to split the SASHELP.CARS data set into multiple text files, one for each Make. The process and code are below; I hope you find it helpful!

This is a two step process:

  1. Sort the file
  2. Generate the output using a Data Step

PROC SORT DATA=SASHELP.CARS OUT=CARS;
BY make;
RUN;

DATA _NULL_;

SET cars; *Dataset to be exported;
BY make; *Variable that file is to be split on;

*Create path to file that is to be exported;
*Built on every row so it is always populated;
out_file=cats('/folders/myfolders/', make);

*temp is a placeholder fileref, FILEVAR supplies the real path;
file temp filevar=out_file dlm=',' dsd;

*If first value of make then output column names;
if first.make then
put 'Make, Model, MPG_HIGHWAY, MPG_CITY';

*Output variables;
put make model mpg_highway mpg_city;

run;
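As a rough check on the result, the number of files written should match the number of distinct Make values. This sketch only counts the groups in SASHELP.CARS; it does not inspect the output folder, and the column alias n_files is made up for illustration.

```sas
proc sql;
    /* One output file is expected per distinct Make */
    select count(distinct make) as n_files
        from sashelp.cars;
quit;
```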


Who Benefits From Open Data

I’ve been involved in Open Data since late 2009, when Edmonton (#yeg) first started its Open Data initiative. Since then, I’ve participated in hackathons and MeetUps in both Surrey and Vancouver while living outside of Alberta. After moving back to Alberta, I’ve once again started to participate in the Open Data initiative within the City of Edmonton. While it is definitely in its infancy, I believe this group does have a lot to offer to the citizens of Edmonton, and the current leadership appears motivated to get there. FYI, I’m involved, but not in a leadership role. I sit/listen/comment/criticize while others work.

Anyways, lately I’ve been thinking about this whole initiative and wondering who actually benefits from Open Data. And the answers I’ve come up with are below. Feel free to add your own in the comments.

1. Government
Open Data relies on the government first organizing its data in a usable form, which, hard as it is to believe, many don’t always do. Being forced to organize the data is self-preservation in the face of all the FOIP and accountability measures that are being brought up.
Government also benefits because it now has access to data from different Ministries and different levels of government (Federal, Provincial, Municipal) without having to jump through hoops. Those hoops could include signing data sharing agreements and memorandums of understanding, which could take weeks if not years to hash out, at a significant cost in both employee time and productivity.

2. Businesses
Releasing Open Data freely means businesses no longer have to pay for access to data that once cost a bundle. It also means that there is now data they didn’t even know existed that they can use in conjunction with proprietary data to enhance business services.
Using Open Data isn’t as easy as downloading a data set most of the time, so there are also business opportunities for skilled analysts and companies that are willing to use these data products to provide new services to companies that need this information.

3. Citizens
In general, Open Data provides a benefit to everyday citizens by providing access to information that would otherwise not be available. But to be honest, knowing how much a government employee gets paid, the demographics of my neighbourhood, or the type of tree in front of my house has a fairly limited benefit to my overall daily quality of life. The benefits of the data will come from how government and business use the data.
There is currently a lot of movement around citizen engagement and citizens using the data to produce apps and new information, but is this a permanent movement or a temporary fad brought on by the sudden access to information? Also, citizens have no incentive to keep updating these products save for general interest, and for many people life has more demands than the data waiting on a website. Like my puppy that’s staring at me waiting for a walk.