Fun with functions

I’ve been programming with SAS for over a decade now, and I’m still a very active user in the community, partly because there’s so much to know. Today alone I learned about two functions that I either didn’t know or had forgotten, plus something new about a third :).

  1. NLITERAL() function
    • SAS allows you to name variables freely with the validvarname=any option. However, this means you need to refer to the variable with quotes around the name and an n at the end to differentiate it from a character string, i.e. 'Variable Name'n. Building a macro variable in this structure can be cumbersome, as you need quoting, concatenation and a trim function to ensure that extra spaces are removed. Then you need to make sure the quotes don’t interfere with your code and resolve properly. The NLITERAL function takes care of all of this by converting the string to a name literal, and it’s smart enough to only do it when required!
  2. CHAR() function
    • The CHAR function returns the single character at a given position in a string. When you need to loop over a string one character at a time, this is incredibly useful.
  3. UPPER() vs %UPCASE() function
    • I’m familiar with UPPER(), but what I didn’t know was that if you apply it to a column when querying a dictionary table in SAS, it slows things down. Instead, leave the column alone and apply the %UPCASE() macro function to the value you are comparing against.
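
A minimal sketch of the three items above, assuming a SAS session where the SASHELP library is available. The sample strings and the macro variable name lib are made up for illustration.

```sas
/* NLITERAL: converts a string to a name literal only when it needs one */
data _null_;
    length plain spaced $ 40;
    plain  = nliteral('Height');         /* already a valid SAS name */
    spaced = nliteral('Variable Name');  /* contains a space */
    put plain= spaced=;
run;

/* CHAR: loop over a string one character at a time */
data _null_;
    word = 'SAS';
    do i = 1 to length(word);
        letter = char(word, i);          /* character at position i */
        put i= letter=;
    end;
run;

/* %UPCASE: uppercase the comparison value, not the dictionary column */
%let lib = sashelp;                      /* hypothetical macro variable */
proc sql;
    select memname
    from dictionary.tables
    where libname = "%upcase(&lib)";
quit;
```

The last query relies on library names being stored in uppercase in the dictionary tables, which is why the column itself can be left untouched.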

Another day, another piece of SASsy knowledge.

How do I write a macro to…split my data set into multiple files?

A very common question on forums and help boards appears to be “How do I write a macro to do XYZ?”. Often a macro isn’t required. This is part one of an ongoing series on how to accomplish a task WITHOUT using a macro.

The first example answers the question of how to split a data set into multiple files. Any file generated using a FILE statement can be generated using this method. I will be using the FILEVAR option of the FILE statement to split the SASHELP.CARS data set into multiple text files, one for each Make. The process and code are below, hope you find them helpful!

This is a two-step process:

  1. Sort the file
  2. Generate the output using a Data Step

PROC SORT DATA=SASHELP.CARS OUT=CARS;
    BY make;
RUN;

DATA _NULL_;

    SET cars; *Dataset to be exported;
    BY make; *Variable that file is to be split on;

    *Create path to file that is to be exported. Set on every row, since out_file is not retained across rows;
    length out_file $200;
    out_file=cats('/folders/myfolders/', make);

    *FILEVAR switches the output file whenever out_file changes;
    file temp filevar=out_file dlm=',' dsd;

    *If first value of make then output column names;
    if first.make then
        put 'Make,Model,MPG_HIGHWAY,MPG_CITY';

    *Output variables;
    put make model mpg_highway mpg_city;

run;
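
To check the result, one of the generated files can be read back in. This is only a sketch: it assumes the step above ran with the /folders/myfolders/ path and produced a file named Acura, and the data set name check is made up for illustration.

```sas
/* Read one of the split files back in to confirm its contents */
data check;
    /* firstobs=2 skips the header row written for each Make */
    infile '/folders/myfolders/Acura' dlm=',' dsd firstobs=2 truncover;
    input make :$13. model :$40. mpg_highway mpg_city;
run;

proc print data=check;
run;
```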


Pseudocode AKA Stop and Think

From Wikipedia:

Pseudocode is an informal high-level description of the operating principle of a computer program or other algorithm.  It uses the structural conventions of a programming language, but is intended for human reading rather than machine reading.

My definition:

A way to force yourself to think through your program before coding: really thinking through what variables you’ll need and what outputs you’ll need. It helps later on with the design. It allows me to work through some of the decision points that I’ll need to program later on, such as:

  • Should I use a macro variable or by groups?
  • Do I need to keep around the results from every simulation or just the end results? Which results do I need?
  • Where do I need counters?
  • Do I need break logic, or should I define my loop with a do while instead?
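
As a sketch of how this can look in practice, the pseudocode can be written as comments first and filled in afterwards. The simulation below is a made-up example (flip a coin until three heads come up), not taken from a real project. It shows two counters and a loop defined with DO WHILE rather than break logic.

```sas
data _null_;
    /* Pseudocode written first, as comments:
         set a seed so the run is repeatable
         start the flip and head counters at zero
         keep flipping while there are fewer than 3 heads
         report how many flips it took                      */
    call streaminit(2024);
    flips = 0;
    heads = 0;
    do while (heads < 3);
        flips + 1;                                    /* counter: total flips */
        if rand('bernoulli', 0.5) = 1 then heads + 1; /* counter: heads */
    end;
    put 'Flips needed to reach 3 heads: ' flips;
run;
```

The same loop could be written as an iterative DO with a LEAVE statement once heads reaches 3. Deciding between the two before coding is exactly the kind of question the pseudocode step brings up.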

This is the ideal, and I admit I don’t do it all the time. Sometimes experience allows me to skip this step, and sometimes I only think experience allows me to skip it when I shouldn’t. There are many times when I wish I hadn’t.

But basically, it’s a step that says STOP AND THINK, because thinking before doing makes things go faster. It’s also a great way to create a program structure that you can then pass on to someone else to actually code, if that’s an option.