Documentation for the evExpress package.


Evaluate text string as an expression in SAS


Version information:

  • Package: evExpress
  • Version: 0.0.4
  • Generated: 2026-02-17T13:19:15
  • Author(s): Bartosz Jablonski ([email protected])
  • Maintainer(s): Bartosz Jablonski ([email protected])
  • License: MIT
  • File SHA256: F*9C8F32B00FD7FFCC184B79D46ECFB523D85EAF54DEFE375798743F28D8FA3677 for this version
  • Content SHA256: C*677ECA2A7ABE4061EBD71273EB1E4898CB43522E1DF5BF22B14AA924DDB02792 for this version

The evExpress package, version: 0.0.4;


The evExpress package is a set of macros and functions designed to help evaluate SAS expressions provided in the form of a text string.

Under the hood, the DoSubL() function is used, which means that the package is not designed for a large number of expressions. If the number of expressions exceeds 10000, consider other options, because the total execution time may exceed your expectations (for example, a batch of 6000 expressions took around 16 seconds on the test machine).

See examples for more details.
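
The size guideline above can be checked before any expressions are evaluated. The following is a minimal sketch, assuming a hypothetical helper macro %checkExpressionCount() that is not part of the package; it only reads the observation count of a data set and writes a message to the log, using the 10000 threshold mentioned above as its default limit:

```sas
/* Hypothetical helper, not part of evExpress: warn when a data set
   holds more expressions than the guideline of 10000. */
%macro checkExpressionCount(ds, limit=10000);
  %local dsid nobs rc;
  %let dsid = %sysfunc(open(&ds));
  %if &dsid = 0 %then %do;
    %put ERROR: Cannot open data set &ds..;
    %return;
  %end;
  /* NLOBS = number of logical observations (deleted ones excluded) */
  %let nobs = %sysfunc(attrn(&dsid, NLOBS));
  %let rc   = %sysfunc(close(&dsid));
  %if &nobs > &limit %then
    %put WARNING: &ds has &nobs observations, more than &limit..;
  %else
    %put NOTE: &ds has &nobs observations.;
%mend checkExpressionCount;

/* Usage: work.have is assumed to exist */
%checkExpressionCount(work.have)
```

For views and some engines NLOBS may be reported as -1, in which case the check is inconclusive.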




SAS package generated by SAS Package Framework, version 20260216, under WIN(X64_10PRO) operating system, using SAS release: 9.04.01M9P06042025.


The evExpress package content

The evExpress package consists of the following content:

  1. %evexpressds() macro

  2. evexpress() function

  3. evexpressc() function

  4. evexpresscq() function

  5. evexpressq() function

  6. License note


%evexpressds() macro

DESCRIPTION:

The %evExpressDS() macro allows for automatic evaluation of expressions provided in the form of a text string, in the context of an observation from an input data set. By default, the macro assumes that the result of the evaluation is numeric, but this can be changed with the char= parameter.

For example, if the input data set have contains the following data:

code      A    B
A + B     17   42
A = B     100  101
A or B    1    0

the result data set want will contain two additional variables, errorCheck and result, holding the status and the value of the expression evaluation, respectively:

code      A    B    errorCheck  result
A + B     17   42   0           59
A = B     100  101  0           0
A or B    1    0    0           1

A non-zero value of errorCheck means that the expression could not be evaluated due to an error.
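
As an illustration of how the status variable might be used downstream, here is a short sketch assuming the default output names (WORK.WANT and errorCheck). The data set names work.evaluated and work.failed are hypothetical and chosen only for this example:

```sas
/* Separate successfully evaluated rows from failed ones.
   Assumes WORK.WANT was created by %evExpressDS() with default names. */
data work.evaluated work.failed;
  set work.want;
  if errorCheck = 0 then output work.evaluated;
  else output work.failed;
run;
```

Rows in work.failed can then be inspected, or rerun with quiet=0 to see the messages in the log.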

SYNTAX:

The basic syntax is as follows; <...> marks optional parameters:

%evExpressDS(
   have
 <,want=>
 <,exp=>
 <,ec=>
 <,res=>
 <,engine=>
 <,quiet=>
 <,strict=>
 <,char=>
)

Arguments description:

  1. have - Required. Valid data set name. If the data set does not exist, the macro stops. No data set options should be used. In particular, the where= option is not recommended.

  • want= - Optional. Valid data set name. Name of the result data set created by the macro. Default value is WORK.WANT.

  • exp= - Optional. Valid variable name. Name of a character variable containing the expression string to be evaluated. Default value is expression.

  • ec= - Optional. Valid variable name. Name of a numeric variable containing the return code from the expression evaluation. A non-zero value means that an error occurred during the evaluation. Default value is errorCheck.

  • res= - Optional. Valid variable name. Name of a variable containing the result of the expression evaluation. The variable is numeric unless char= is set. Default value is result.

  • engine= - Optional. Valid name of a SAS engine. Only the BASE(V9) and SPDE engines are supported. Default value is BASE.

  • quiet= - Optional. Indicates if execution errors and notes should be printed in the log. Allowed values are 1 and 0. Default value of 1 means: do not print.

  • strict= - Optional. Indicates if implicit type conversion is considered an error. Allowed values are 1 and 0. Default value of 1 means: conversion is an error.

  • char= - Optional. Indicates if the result variable is numeric or character. Allowed values are integers from 0 to 32767. Default value of 0 means: the variable is numeric. Any other value makes the variable character and sets its length.
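
To show how the optional parameters combine, here is a sketch of a call that overrides several defaults. It assumes a data set have with the expression strings stored in a variable named code; work.checked, rc, and value are names chosen for illustration only:

```sas
/* want=   result data set instead of the default WORK.WANT
   exp=    variable holding the expression strings
   ec=     return-code variable instead of the default errorCheck
   res=    result variable instead of the default result
   quiet=0 print execution errors and notes in the log
   strict=0 do not treat implicit type conversion as an error */
%evExpressDS(
   have
  ,want=work.checked
  ,exp=code
  ,ec=rc
  ,res=value
  ,quiet=0
  ,strict=0
)

proc print data=work.checked;
run;
```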


EXAMPLES AND USECASES:

EXAMPLE 1. Basic use case - evaluate expressions in the code variable:

data have;
 length x y 8 c $ 1 code $ 50;
 code = '(x > 1) and c="A"';             x= 3; y=11; c="A"; output;
 code = '(x < 1 and (9 < y < 13))';      x=-3; y=17; c="B"; output;
 code = '(x < 1) or C in ("C","D","E")'; x= 3; y=11; c="C"; output;
 code = '(3 - sqrt(x+y))';               x= 5; y=11; c="D"; output;
 code = '(12 + sum(of x--y))';           x= 3; y=9;  c="E"; output;
 code = 'sin(x)**2 + cos(x)**2';         x=-3; y=11; c="F"; output;
run;

%evExpressDS(have,exp=code)

proc print data=work.want;
run;

EXAMPLE 2. Compare the handling of implicit type conversion (strict=1 vs. strict=0):

data have2;
 length c $ 1 expression $ 50;
 expression = '"A" > 2'; c="G"; output;
 expression = ' C < 2';  c="H"; output;
run;

%evExpressDS(have2,want=work.want2str)

proc print data=work.want2str;
run;

%evExpressDS(have2,want=work.want2nostr,strict=0)

proc print data=work.want2nostr;
run;

EXAMPLE 3. Execution time considerations (use have from Ex.1):

data haveBigger;
  set have;
  do i=1 to 1000;
    output;
  end;
  rename code = expression;
run;

%evExpressDS(haveBigger,want=work.wantBigger)

proc print data=work.wantBigger;
run;

Note produced: "[EVEXPRESSDS] Processing time: 16.227" seconds.

EXAMPLE 4. Expression that evaluates to character value:

data have4;
 length c $ 1 expression $ 50;
 expression = '"A" !! C !! "B"'; c="i"; output;
 expression = 'repeat(C, 42)';   c="j"; output;
run;

%evExpressDS(have4,want=work.want4,char=43)

proc print data=work.want4;
run;


evexpress() function

DESCRIPTION:

The evExpress(), evExpressQ(), evExpressC(), and evExpressCQ() functions allow for automatic evaluation of expressions provided in the form of a text string, in the context of an observation from an input data set.

Functions without C in the name assume that the result of the evaluation is numeric; functions with C return a character value.

All functions expect 3 parameters: expression, indsname, and curobs.

The first parameter is the name of a variable holding the string to be evaluated. The second is the name of the data set with the data used in the evaluation. The third is the number of the observation (from that data set) for which the evaluation is done.

When evExpress<C>() is executed and an error occurs during the evaluation, a warning is issued for each affected observation and all error messages are printed in the log.

Implicit type conversion is considered an error.

When evExpress<C>Q() is executed (Q stands for quiet), no error messages are printed in the log.

For every DATA step using the evExpress<C>() function, two global macro variables are created:

  • EEF_CHECK_**************** and
  • EEF_PASSED_****************.

The evExpress<C>Q() function creates only the second macro variable.

The asterisks stand for the hex16.-formatted value of the timestamp of the execution start. These macro variables are not automatically removed from the session.
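
Because these macro variables stay in the session, a long-running session may want to remove them from time to time. The following is a minimal sketch, assuming a hypothetical helper macro %eefCleanup() that is not part of the package; it relies only on the two name prefixes given above and on the standard SASHELP.VMACRO view:

```sas
/* Hypothetical helper, not part of evExpress: delete the global
   macro variables left behind by the evExpress functions. */
%macro eefCleanup();
  data _null_;
    /* SASHELP.VMACRO lists macro variables; long values span several
       rows, so keep only the first row (offset = 0) per variable. */
    set sashelp.vmacro(where=(scope = 'GLOBAL' and offset = 0));
    /* =: compares only the leading characters (prefix match) */
    if name =: 'EEF_CHECK_' or name =: 'EEF_PASSED_' then
      call symdel(strip(name), 'NOWARN');
  run;
%mend eefCleanup;

%eefCleanup()
```

The cleanup should be run only after the DATA steps that use the functions have finished, since the sketch makes no attempt to detect variables that are still in use.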

SYNTAX:

The basic syntax is as follows; <...> marks optional parts:

var1 = evExpress<C>(
  expression 
, indsname 
, curobs
);

var2 = evExpress<C>Q(
  expression 
, indsname 
, curobs
);

Arguments description:

  1. expression - Required. Character variable containing string with expression to be evaluated.

  2. indsname - Required. Character variable containing the name of a data set with variables used in evaluation. If the expression variable contains only constant expressions (i.e., expressions not using any variables, like: 46 & 2 or 21 + 21) an empty string can be provided as the value.

  3. curobs - Required. Numeric variable containing number of an observation that will provide data used during the evaluation.


EXAMPLES AND USECASES:

EXAMPLE 1. Basic use case - evaluate expressions in the code variable:

data have;
 length x y 8 c $ 1 code $ 50;
 code = '(x > 1) and c="A"';             x= 3; y=11; c="A"; output;
 code = '(x < 1 and (9 < y < 13))';      x=-3; y=17; c="B"; output;
 code = '(x < 1) or C in ("C","D","E")'; x= 3; y=11; c="C"; output;
 code = '(3 - sqrt(x+y))';               x= 5; y=11; c="D"; output;
 code = '(12 + sum(of x--y))';           x= 3; y=9;  c="E"; output;
 code = 'sin(x)**2 + cos(x)**2';         x=-3; y=11; c="F"; output;
run;

data want;
  set have 
    curobs=curobs 
    indsname=indsname;

  value = evExpress(code, indsname, curobs);
run;

proc print data=want;
run;

EXAMPLE 2. Errors are returned for implicit type conversion:

data have2;
 length c $ 1 expression $ 50;
 expression = '"A" > 2'; c="G"; output;
 expression = ' C  < 2'; c="H"; output;
 expression = ' C ** 2'; c="I"; output;
run;

data want2;
  set have2 
    curobs=curobs 
    indsname=indsname;

  value = evExpress(expression, indsname, curobs);
run;

proc print data=want2;
run;

EXAMPLE 3. Execution time considerations (use have from Ex.1):

data haveBigger;
  set have;
  do i=1 to 1000;
    output;
  end;
  rename code = expression;
run;

data wantBigger;
  set haveBigger 
    curobs=curobs 
    indsname=indsname;

  value = evExpress(expression, indsname, curobs);
run;

Log note:

NOTE: DATA statement used (Total process time):
      real time           16.80 seconds
      cpu time            15.62 seconds

EXAMPLE 4. Constant expressions:

data constantExpressions;
  infile cards;
  input;
  length exp $ 128;
  exp = strip(_infile_);
cards;
1 and (1 or 0)
3+5+7
2**8
("A"="B") + 17
3+2=5
;
run;

data want3;
  set constantExpressions;
  value = evExpress(exp,"",0);
run;

proc print data=want3;
run;


evexpressc() function

DESCRIPTION:

The evExpress(), evExpressQ(), evExpressC(), and evExpressCQ() functions allow for automatic evaluation of expressions provided in the form of a text string, in the context of an observation from an input data set.

Functions without C in the name assume that the result of the evaluation is numeric; functions with C return a character value.

All functions expect 3 parameters: expression, indsname, and curobs.

The first parameter is the name of a variable holding the string to be evaluated. The second is the name of the data set with the data used in the evaluation. The third is the number of the observation (from that data set) for which the evaluation is done.

When evExpress<C>() is executed and an error occurs during the evaluation, a warning is issued for each affected observation and all error messages are printed in the log.

Implicit type conversion is considered an error.

When evExpress<C>Q() is executed (Q stands for quiet), no error messages are printed in the log.

For every DATA step using the evExpress<C>() function, two global macro variables are created:

  • EEF_CHECK_**************** and
  • EEF_PASSED_****************.

The evExpress<C>Q() function creates only the second macro variable.

The asterisks stand for the hex16.-formatted value of the timestamp of the execution start. These macro variables are not automatically removed from the session.

SYNTAX:

The basic syntax is as follows; <...> marks optional parts:

var1 = evExpress<C>(
  expression 
, indsname 
, curobs
);

var2 = evExpress<C>Q(
  expression 
, indsname 
, curobs
);

Arguments description:

  1. expression - Required. Character variable containing string with expression to be evaluated.

  2. indsname - Required. Character variable containing the name of a data set with variables used in evaluation. If the expression variable contains only constant expressions (i.e., expressions not using any variables, like: 46 & 2 or 21 + 21) an empty string can be provided as the value.

  3. curobs - Required. Numeric variable containing number of an observation that will provide data used during the evaluation.


EXAMPLES AND USECASES:

EXAMPLE 1. Expression that evaluates to character value:

data have;
 length c $ 1 expression $ 50;
 expression = '"A" !! C !! "B"'; c="i"; output;
 expression = 'repeat(C, 42)';   c="j"; output;
run;

data want;
  set have 
    curobs=curobs 
    indsname=indsname;
  length value $ 43;
  value = evExpressC(expression, indsname, curobs);
run;

proc print data=want;
run;


evexpresscq() function

DESCRIPTION:

The evExpress(), evExpressQ(), evExpressC(), and evExpressCQ() functions allow for automatic evaluation of expressions provided in the form of a text string, in the context of an observation from an input data set.

Functions without C in the name assume that the result of the evaluation is numeric; functions with C return a character value.

All functions expect 3 parameters: expression, indsname, and curobs.

The first parameter is the name of a variable holding the string to be evaluated. The second is the name of the data set with the data used in the evaluation. The third is the number of the observation (from that data set) for which the evaluation is done.

When evExpress<C>() is executed and an error occurs during the evaluation, a warning is issued for each affected observation and all error messages are printed in the log.

Implicit type conversion is considered an error.

When evExpress<C>Q() is executed (Q stands for quiet), no error messages are printed in the log.

For every DATA step using the evExpress<C>() function, two global macro variables are created:

  • EEF_CHECK_**************** and
  • EEF_PASSED_****************.

The evExpress<C>Q() function creates only the second macro variable.

The asterisks stand for the hex16.-formatted value of the timestamp of the execution start. These macro variables are not automatically removed from the session.

SYNTAX:

The basic syntax is as follows; <...> marks optional parts:

var1 = evExpress<C>(
  expression 
, indsname 
, curobs
);

var2 = evExpress<C>Q(
  expression 
, indsname 
, curobs
);

Arguments description:

  1. expression - Required. Character variable containing string with expression to be evaluated.

  2. indsname - Required. Character variable containing the name of a data set with variables used in evaluation. If the expression variable contains only constant expressions (i.e., expressions not using any variables, like: 46 & 2 or 21 + 21) an empty string can be provided as the value.

  3. curobs - Required. Numeric variable containing number of an observation that will provide data used during the evaluation.
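
This section lists no example of its own. The following is a minimal sketch, assuming that evExpressCQ() is called exactly as in the syntax above. The input mirrors the evExpressC() example; have5 and want5 are illustrative data set names. The third expression is expected to fail, since implicit type conversion is considered an error; the value returned in that case is not specified here, so the sketch makes no claim about it:

```sas
data have5;
 length c $ 1 expression $ 50;
 expression = '"A" !! C !! "B"'; c="i"; output;
 expression = 'repeat(C, 42)';   c="j"; output;
 expression = 'C ** 2';          c="k"; output; /* expected to fail */
run;

data want5;
  set have5
    curobs=curobs
    indsname=indsname;
  length value $ 43; /* repeat(C, 42) yields 43 characters */
  /* quiet variant: no error messages are printed in the log */
  value = evExpressCQ(expression, indsname, curobs);
run;

proc print data=want5;
run;
```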



evexpressq() function

DESCRIPTION:

The evExpress(), evExpressQ(), evExpressC(), and evExpressCQ() functions allow for automatic evaluation of expressions provided in the form of a text string, in the context of an observation from an input data set.

Functions without C in the name assume that the result of the evaluation is numeric; functions with C return a character value.

All functions expect 3 parameters: expression, indsname, and curobs.

The first parameter is the name of a variable holding the string to be evaluated. The second is the name of the data set with the data used in the evaluation. The third is the number of the observation (from that data set) for which the evaluation is done.

When evExpress<C>() is executed and an error occurs during the evaluation, a warning is issued for each affected observation and all error messages are printed in the log.

Implicit type conversion is considered an error.

When evExpress<C>Q() is executed (Q stands for quiet), no error messages are printed in the log.

For every DATA step using the evExpress<C>() function, two global macro variables are created:

  • EEF_CHECK_**************** and
  • EEF_PASSED_****************.

The evExpress<C>Q() function creates only the second macro variable.

The asterisks stand for the hex16.-formatted value of the timestamp of the execution start. These macro variables are not automatically removed from the session.

SYNTAX:

The basic syntax is as follows; <...> marks optional parts:

var1 = evExpress<C>(
  expression 
, indsname 
, curobs
);

var2 = evExpress<C>Q(
  expression 
, indsname 
, curobs
);

Arguments description:

  1. expression - Required. Character variable containing string with expression to be evaluated.

  2. indsname - Required. Character variable containing the name of a data set with variables used in evaluation. If the expression variable contains only constant expressions (i.e., expressions not using any variables, like: 46 & 2 or 21 + 21) an empty string can be provided as the value.

  3. curobs - Required. Numeric variable containing number of an observation that will provide data used during the evaluation.


EXAMPLES AND USECASES:

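EXAMPLE 1. Basic use case - evaluate expressions in the code variable (the last three expressions are expected to fail: two rely on implicit type conversion and one refers to a variable z that is not in the data set; in quiet mode no error messages are printed):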

data have;
 length x y 8 c $ 1 code $ 50;
 code = '(x > 1) and c="A"';             x= 3; y=11; c="A"; output;
 code = '(x < 1 and (9 < y < 13))';      x=-3; y=17; c="B"; output;
 code = '(x < 1) or C in ("C","D","E")'; x= 3; y=11; c="C"; output;
 code = '("A" + sqrt(x+y))';             x= 5; y=11; c="D"; output;
 code = '(12 + + + C**2)';               x= 3; y=9;  c="E"; output;
 code = 'sin(z)**2 + cos(z)**2';         x=-3; y=11; c="F"; output;
run;

data want;
  set have 
    curobs=curobs 
    indsname=indsname;

  value = evExpressQ(code, indsname, curobs);
run;

proc print data=want;
run;



License

Copyright (c) Bartosz Jablonski, since 2025 onward.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.