Introduction
Scheduling SAS programs to run on a UNIX system can be accomplished in a number of ways. I developed a Perl script to manage the process of running SAS code, parsing the SAS log, and emailing the results. Instead of scheduling the SAS executable to run directly, I schedule the Perl script and it takes care of the rest. Specifically, it does the following:
- It checks to make sure that the SAS .sas program file that you want to run actually exists,
- It runs the SAS program in batch mode and creates datetime-stamped .log and .lst files in the same directory as the .sas file,
- It checks the SAS log file for errors, warnings, uninitialized variable messages, and to see if the program ran all the way to the end, and
- It sends an email reporting what happened, along with attached copies of the .log file and the .lst file (if SAS produced one).
The three ingredients of the solution are:
- A .sas program (tweaked very slightly),
- The Perl script called runsas.run, and
- The crontab UNIX scheduling utility.
Ingredient #1: The .sas Program File
Assuming you have a UNIX .sas program file in hand, add the following line to the very end of it:
%put FINISHED;
runsas.run looks for the word FINISHED in the SAS log to confirm that the program ran all the way to the end. For example:
*make a dataset that counts from 1 to 10;
data test;
do x = 1 to 10;
output;
end;
run;
%put FINISHED;
Ingredient #2: runsas.run
I wrote the runsas.run Perl script to coordinate the SAS program execution, log parsing, and emailing process. The full text of the script is at the end of this post. I'm going to assume that you have saved that code in a text file named runsas.run in your home directory (i.e., ~/runsas.run) and that you have granted execution rights on that file by running chmod +x ~/runsas.run. runsas.run takes two command line arguments:
- The fully specified name of the program (including which directory it's in -- and P.S. please do NOT have spaces in the directory or file names!) and
- your email address (or comma-delimited list of email addresses -- with no spaces).
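As a rough illustration of the two arguments, a call might look like the sketch below. The program path and email addresses are made up for the example, and the small runsas_command helper is hypothetical (it is not part of runsas.run); it only assembles and prints the command so that the no-spaces rule can be checked before anything is run.

```shell
# Hypothetical values, for illustration only.
sas_prog="/home/jdoe/sas/monthly_report.sas"
recipients="jdoe@example.com,asmith@example.com"

# Hypothetical helper (not part of runsas.run): prints the command line,
# or fails if either argument contains a space.
runsas_command() {
    case "$1$2" in
        *" "*) echo "arguments must not contain spaces" >&2; return 1 ;;
    esac
    echo "$HOME/runsas.run $1 $2"
}

runsas_command "$sas_prog" "$recipients"
```

To run the job for real, type the printed command at the prompt.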
Here's what runsas.run does:
- It ensures that the program exists,
- It runs the SAS program in batch mode and creates datetime-stamped .log and .lst files in the same directory as the .sas file,
- It checks the SAS log file for errors, warnings, uninitialized variable messages, and to see if the program ran all the way to the end, and
- It sends an email to the provided email address indicating how the program ran, along with attached copies of the .log file and the .lst file (if a .lst was produced by SAS).
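To make the file naming concrete, here is a small shell sketch of how the datetime-stamped names come out. It mirrors the script's "%Y%m%d_%H%M%S" stamp format, but it is only an approximation: the program path is hypothetical, and the shell suffix removal is case-sensitive, whereas the script's regex also accepts an upper-case .SAS extension.

```shell
# Hypothetical program path, for illustration only.
sas_prog="/home/jdoe/sas/monthly_report.sas"

# Same stamp layout the script uses: YYYYMMDD_HHMMSS on a 24-hour clock.
stamp=$(date +%Y%m%d_%H%M%S)

# Strip the .sas extension, then add the stamp and the new extension.
base="${sas_prog%.sas}"
sas_log="$base.$stamp.log"
sas_lst="$base.$stamp.lst"

echo "$sas_log"
echo "$sas_lst"
```

Because the stamp changes every second, two runs of the same program end up with different .log and .lst names.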
Ingredient #3: crontab
Crontab is the built-in UNIX job scheduling utility. There are numerous websites devoted to it such as http://crontab.org/ and http://www.adminschoice.com/crontab-quick-reference. Do a web search for "crontab reference" to find others if you wish to learn more beyond what will be explained here.
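As a sketch of what a schedule could look like, the snippet below builds a crontab entry that would run runsas.run at 06:30 every weekday. The schedule, program path, and email address are assumptions made up for the example, and the standard five-field crontab layout is assumed; adjust them for your own system.

```shell
# Crontab fields: minute hour day-of-month month day-of-week command
# Hypothetical entry: 06:30, Monday through Friday.
# Single quotes keep $HOME literal so that cron's shell expands it at run time.
cron_line='30 6 * * 1-5 $HOME/runsas.run /home/jdoe/sas/monthly_report.sas jdoe@example.com'

# Show the entry. To install it, paste it into the editor opened by "crontab -e".
printf '%s\n' "$cron_line"
```

Listing the installed entries afterwards with crontab -l is a quick way to confirm that the line was saved.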
Getting Results
After runsas.run runs, you should get an email. The subject line indicates whether the program had warnings, errors, etc., and the body of the email summarizes that information. Any errors, warnings, and uninitialized variable notes that occur are listed in the email, and the .log file and the .lst file (if there is one) are attached. Warnings considered to be "false alarms" are listed in the "Ignored error/warning lines" section of the email. The original .log and .lst files are on UNIX in the SAS program directory. The datetime stamp (with time measured on a 24-hour clock) prevents reruns from overwriting prior .log and .lst files. If your program works but you forgot to put the %put FINISHED; line at the end, the email will say so as well.
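If you want to spot-check a log by hand, the checks can be approximated from the shell. This is a simplified sketch, not the script's own logic: the summarize_sas_log helper is hypothetical, it has no list of ignored warnings, and it counts each pattern independently, whereas runsas.run assigns every log line to the first category it matches.

```shell
# Hypothetical helper: prints a one-line summary for the SAS log named in $1.
summarize_sas_log() {
    log="$1"
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
    errors=$(grep -c '^ERROR' "$log" || true)
    warnings=$(grep -c '^WARNING' "$log" || true)
    uninit=$(grep -c 'uninitialized' "$log" || true)
    # The %put FINISHED; line at the end of the program leaves this marker in the log.
    if grep -q 'FINISHED' "$log"; then finished=yes; else finished=no; fi
    echo "errors=$errors warnings=$warnings uninitialized=$uninit finished=$finished"
}

# Usage (hypothetical file name):
#   summarize_sas_log /home/jdoe/sas/monthly_report.20240102_063000.log
```

A log with no matches in the first three counts and finished=yes corresponds to the "finished successfully" case in the email.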
Full Text of runsas.run
#!/usr/bin/perl
use strict;
use warnings;
use lib "$ENV{HOME}/perl/lib"; #Custom add-on libs (contains MIME::Lite); Perl does not expand "~", so $ENV{HOME} is used
use MIME::Lite;                #Email module found in custom add-on lib/MIME
                               #See http://search.cpan.org/~rjbs/MIME-Lite-3.028/lib/MIME/Lite.pm
use POSIX qw(strftime);        #POSIX strftime returns time formatted as a string

####################################################################################################
# Purpose: this Perl script requires two command line parameters:
#            1. the fully specified name of the SAS program to execute (path and .sas
#               filename extension included, no spaces), and
#            2. the email address(es) to notify.
#          A notification email is sent with feedback on how the SAS program
#          performed once the program has finished.
####################################################################################################
my $num_args = $#ARGV + 1; #Get # command line parameters into a var

if ($num_args == 2) { #Make sure that 2 command line arguments provided.
    my $start_time = time; #Record the current time (time this process started)
    my $datetimestamp = strftime "%Y%m%d_%H%M%S", localtime; #Create datetime stamp string

    #The first command line argument is the name of the SAS program to be executed
    my $sas_prog_file = $ARGV[0]; #Put first command line arg into a variable
    chomp($sas_prog_file); #Remove trailing newline, if any
    unless (-e $sas_prog_file) { die "SAS program ($sas_prog_file) does not exist!"; } #Make sure program exists

    #Match with 3-part regex: 1=path, 2=program name (w/o extension), 3=filename extension
    unless ($sas_prog_file =~ /^(.+\/)(\S+)(\.sas)$/i) {
        die "SAS program ($sas_prog_file) must be given with its path and a .sas extension!";
    }
    my $prog_path = $1;  #First part is the program path
    my $prog_title = $2; #Second part is the program file title, last part is filename extension which isn't used

    #The second command line argument is the email address(es) to whom notification should be sent
    my $email_recipient = $ARGV[1]; #Put second command line arg into a local variable
    chomp($email_recipient); #Remove trailing newline, if any

    #Start a string that will contain notes about how the process went (process log) and will get
    #sent as the body of the notification email
    my $notes = sprintf "Start time: %s\n", scalar(localtime($start_time)); #Put start time first in the notes

    #Derive names of datetime-stamped SAS log and lst files from the SAS program file path and title
    my $sas_log_file = "$prog_path$prog_title.$datetimestamp.log";
    my $sas_lst_file = "$prog_path$prog_title.$datetimestamp.lst";

    my $sas_exe = "/bin/sas/sas"; #Physical location of the SAS execution script/file (customize for your box)
    my $command_line = "$sas_exe -RSASUSER -noterminal -sysin $sas_prog_file -log $sas_log_file -print $sas_lst_file"; #Complete batch SAS command

    #Shell out and run sas synchronously with nohup (no hangups) command and > /dev/null to suppress stdout feedback on the command line
    system("nohup $command_line > /dev/null");

    #Now that SAS finished running, check the results by parsing the SAS log file.
    #Error, warning, uninitialized variables, ignored errors/warnings, and presence of finish flag will be tracked
    my @error_lines = ();   #Array of SAS log error lines
    my @warning_lines = (); #Array of SAS log warning lines
    my @uninit_lines = ();  #Array of SAS log uninitialized variables lines
    my @ignored_lines = (); #Array of SAS log error/warning lines that are being ignored because they don't constitute "real" problems
    my $finished_flag = 0;  #Did the program run until the end (where there is a %put FINISHED SAS statement)?

    #Loop through all lines of the SAS log looking for errors, warnings, and so on...
    my $line_counter = 0;
    open (LOGFILE, '<', $sas_log_file) or die $!;
    while (my $line = <LOGFILE>) { #Loop through every line of the SAS log file
        $line_counter = $line_counter + 1;
        #First check if line is a warning that can be ignored
        if (   $line =~ /^WARNING: Unable to copy SASUSER registry to WORK registry.*$/
            || $line =~ /^WARNING: No preassigned object definitions were found.*$/
            || $line =~ /^WARNING: In-database formatting is not available on the database.*$/
            || $line =~ /^WARNING: Data too long for column.*$/
            || $line =~ /^WARNING: The current setting of the DIRECT_EXE libname option will not allow this SQL statement.*$/
           ) {
            chomp($line);
            push @ignored_lines, "[$line_counter] $line";
        }
        #Check if a real error
        elsif ($line =~ /^ERROR/) {
            chomp($line);
            push @error_lines, "[$line_counter] $line";
        }
        #Check if a warning
        elsif ($line =~ /^WARNING/) {
            chomp($line);
            push @warning_lines, "[$line_counter] $line";
        }
        #Check if an uninitialized variable note
        elsif ($line =~ /uninitialized/) {
            chomp($line);
            push @uninit_lines, "[$line_counter] $line";
        }
        #Check if it is the finished flag
        elsif ($line =~ /FINISHED/) {
            $finished_flag = 1;
        }
    }
    close (LOGFILE); #Finished parsing the SAS log file

    #Add error lines to the notes, if any
    my $i = 0;
    my $tempcount = $#error_lines + 1;
    $notes = $notes . "\n# Error lines: $tempcount\n";
    if ($#error_lines >= 0) {
        for ($i=0; $i <= $#error_lines; ++$i) {
            $notes = $notes . $error_lines[$i] . "\n";
        }
    }

    #Add warning lines to the notes, if any
    $tempcount = $#warning_lines + 1;
    $notes = $notes . "\n# Warning lines: $tempcount\n";
    if ($#warning_lines >= 0) {
        for ($i=0; $i <= $#warning_lines; ++$i) {
            $notes = $notes . $warning_lines[$i] . "\n";
        }
    }

    #Add uninitialized variables lines to the notes, if any
    $tempcount = $#uninit_lines + 1;
    $notes = $notes . "\n# Uninitialized lines: $tempcount\n";
    if ($#uninit_lines >= 0) {
        for ($i=0; $i <= $#uninit_lines; ++$i) {
            $notes = $notes . $uninit_lines[$i] . "\n";
        }
    }

    #Add ignored lines to the notes, if any
    $tempcount = $#ignored_lines + 1;
    $notes = $notes . "\n# Ignored error/warning lines: $tempcount\n";
    if ($#ignored_lines >= 0) {
        for ($i=0; $i <= $#ignored_lines; ++$i) {
            $notes = $notes . $ignored_lines[$i] . "\n";
        }
    }

    #Add "finished flag" status to the notes if no errors, warnings, uninits and make email subject line.
    #Starting the email subject line with [SCHEDULED SAS PROGRAM] makes it easy to see these in the inbox.
    my $email_subject_line = "[SCHEDULED SAS PROGRAM] $prog_title ";
    if ($#error_lines == -1 && $#warning_lines == -1 && $#uninit_lines == -1) {
        if ($finished_flag == 1) {
            #All appears to have gone well
            $notes = $notes . "\nJob finished successfully!\n";
            $email_subject_line = $email_subject_line . "finished successfully";
        }
        else {
            #Something weird happened, or the %put FINISHED line is missing
            $notes = $notes . "\nNo errors, warnings, or uninitialized variables, but job did NOT finish!\n";
            $email_subject_line = $email_subject_line . "did NOT finish";
        }
    }
    else {
        if ($#error_lines == -1 && $#uninit_lines == -1 && $#warning_lines >= 0) {
            $email_subject_line = $email_subject_line . "finished with warnings";
        }
        elsif ($#error_lines == -1 && $#uninit_lines >= 0 && $#warning_lines == -1) {
            $email_subject_line = $email_subject_line . "finished with uninitialized variables";
        }
        else {
            $email_subject_line = $email_subject_line . "finished with errors";
        }
    }

    my $end_time = time; #Capture end of process time

    #Calculate human-readable elapsed time string
    my $elapsed_time = $end_time - $start_time; #In seconds
    my $hours = int($elapsed_time / 60 / 60);
    my $minutes = int(($elapsed_time-$hours*3600) / 60);
    my $seconds = $elapsed_time - $hours*3600 - $minutes*60;
    my $human_readable_elapsed_time = sprintf '%dh:%02dm:%02ds', $hours, $minutes, $seconds;
    $notes = $notes . sprintf "\nEnd time: %s\n", scalar(localtime($end_time));
    $notes = $notes . "Elapsed time: $human_readable_elapsed_time\n\n";

    #Send notification email
    #Create a new email
    my $msg = MIME::Lite->new(
        From     => "$email_recipient",
        To       => "$email_recipient",
        Subject  => "$email_subject_line",
        Type     => 'text/plain',
        Encoding => '8bit',
        Data     => "$notes"
    );

    #Attach the log file if it exists
    if (-e $sas_log_file) {
        $msg->attach(
            Type     => 'application/octet-stream',
            Encoding => 'base64',
            Path     => "$sas_log_file",
            Filename => "$prog_title.$datetimestamp.log"
        );
    }

    #Attach the lst file if it exists
    if (-e $sas_lst_file) {
        $msg->attach(
            Type     => 'application/octet-stream',
            Encoding => 'base64',
            Path     => "$sas_lst_file",
            Filename => "$prog_title.$datetimestamp.lst"
        );
    }

    #Send the email
    $msg->send;
    #End of process
}
else {
    #Program was not called with the expected 2 command line parameters
    die "2 command line arguments (1. SAS program, 2. email) expected and not found.\n";
}