This section reviews the similarities and differences between SAS and R so that organizations can make better-informed decisions about their SAS-to-R migration plans.
Sponsors who want to keep SAS for legacy studies may still be interested in using R for custom graphs and data visualization, or in Shiny apps for interacting with users and clinical data. Sponsors who want to be submission ready should proceed with caution until all required R packages have been installed and tested and can produce SDTMs, ADaMs and TLGs.
While almost everything in SAS can be replicated in R, there is a steep learning curve because R concepts and process flow are more object oriented. R also assigns specific meanings to special characters: for example, [] subsets or indexes an object, {} groups statements into a block, and () encloses the arguments of a function call. In addition, most R syntax consists of functions, which are similar to SAS functions and macro programs, so knowing how to call SAS functions and macros helps in understanding, writing and executing R functions. Like SAS macro programs, R functions can have positional, keyword and default parameters. See the list of R packages installed in SAS LSAF.
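To make the parallel with SAS macro parameters concrete, here is a minimal base R sketch. The function name `summarize_var`, its parameters and the sample values are hypothetical and chosen only for illustration.

```r
# Hypothetical helper, comparable to a SAS macro with one positional
# parameter (x) and two keyword parameters that carry default values.
summarize_var <- function(x, digits = 1, na_rm = TRUE) {
  round(mean(x, na.rm = na_rm), digits)
}

ages <- c(34, 52, NA, 47)

by_position <- summarize_var(ages, 2)           # digits matched by position
by_keyword  <- summarize_var(ages, digits = 2)  # digits matched by name
by_default  <- summarize_var(ages)              # digits falls back to its default of 1

first_age <- ages[1]                            # [] selects the first element
```

The first two calls give the same result because R matches unnamed arguments by position and named arguments by name, much like positional and keyword parameters in a SAS macro call.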
A few SAS terms and their closest R equivalents are listed below.
| SAS | R |
| --- | --- |
| data set | data frame |
| observation # | rownames() |
| data set options () | data frame options [] |
| label | Hmisc::label() |
| variable | vector |
| types: numeric, character, date | numeric, character, date |
| N/A | list variable type |
| modules, procs & functions | R packages and functions (ex. tidyverse) |
| data steps: retain, if then, vr=, by | any(), ifelse(), mutate(), group_by() to replicate |
| output | not easy to replicate |
| first., last. | slice(1), slice(n()) after group_by() |
| do loops, arrays | for loops with data frame index references |
| proc sql | dplyr: select(), mutate(), filter(), case_when(), arrange(), group_by(), %>% |
| left join, right join, inner join, full outer join | left_join(), right_join(), inner_join(), full_join() |
| subqueries | mutate(), summarize(), left_join() to replicate |
| proc print | print() |
| proc freq | table() |
| proc means | summarize() |
| proc contents | Hmisc::contents() |
| proc compare | diffdf() |
| proc sort, nodup | arrange(), distinct(), group_by_all() |
| proc transpose | pivot_longer(), pivot_wider() |
| functions | R functions |
| min, max, mean, sum, median, std | min, max, mean, sum, median, sd |
| lowcase() | tolower() |
| upcase() | toupper() |
| lag(), lead() | lag(), lead() |
| tranwrd() | str_replace_all() |
| strip() | str_trim() |
| compress() | str_remove_all() to drop characters, str_extract() to keep matched characters |
| find() | str_detect() |
| substr() | str_sub() |
| catx() | paste() |
| scan() | strsplit() |
| index() | grep() |
| input() | as.numeric() |
| put() | as.character() |
| length() | width option in format(); nchar() returns the number of characters in a variable |
| N/A | length() returns the number of variables (for a data frame) or the number of records (for a vector) |
| count() | count() |
| macro programs | R functions and user-defined functions |
| macro variables | vectors with one or more values |
| global macro variables | vectors with one or more values, ex. x <- 'Y' |
| local macro variables | variables defined within R functions |
| defaults and keyword parameters | defaults and keyword parameters |
| ODS | R Markdown |
| logs | logrx |
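
Several of the equivalents listed above can be illustrated together. The following is a minimal sketch, assuming the dplyr package is installed; the data frame `visits` and its column names are hypothetical, and the code shows one possible way to mimic a sort followed by first./last. by-group processing, proc means and proc freq.

```r
library(dplyr)

# Hypothetical visit-level data: two subjects, visits out of order
visits <- data.frame(
  usubjid  = c("S01", "S01", "S01", "S02", "S02"),
  visitnum = c(2, 1, 3, 1, 2),
  aval     = c(5.1, 4.8, 5.6, 6.0, 6.3)
)

# Like proc sort by usubjid visitnum, followed by a BY statement
sorted <- visits %>%
  arrange(usubjid, visitnum) %>%
  group_by(usubjid)

# Like keeping rows where first.usubjid or last.usubjid is true
first_visit <- sorted %>% slice(1) %>% ungroup()
last_visit  <- sorted %>% slice(n()) %>% ungroup()

# Like proc means with a class variable
means <- visits %>%
  group_by(usubjid) %>%
  summarize(n = n(), mean_aval = mean(aval))

# Like a one-way proc freq (base R)
freq <- table(visits$usubjid)
```

Note that `slice()` works within each group only because of the preceding `group_by()`; on an ungrouped data frame it would return the first or last row of the whole table.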