Rf/expression logic
funpack
merge request
The way that funpack currently combines expressions on variables with different numbers of columns is slightly flawed. It attempts to combine results within each visit. For example, in the expression v1 > 4 && v2 < 8, if variables 1 and 2 both have columns for 3 visits, the expression is evaluated separately on the columns for visits 1, 2, and 3, and the per-visit results are then combined with a logical OR.
This approach falls apart when combining expressions on variables with different numbers of columns, such as the ICD columns. For example, in the expression v41202 == "A001" && v25010 != na (all subjects with ICD diagnosis A001 who have imaging data), FUNPACK would still attempt to pair up columns with matching visit/instance numbers, with the result that only the first couple of columns of 41202 would be considered in the expression.
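The failure mode can be illustrated with a minimal sketch. This is not funpack's actual code: the data layout, the function name evaluate_paired, and the use of list position as a stand-in for visit/instance number are all assumptions made for illustration, with None standing in for na.

```python
# Hypothetical data for one subject: each variable ID maps to its column values.
# List position stands in for the visit/instance number (an assumption).
subject = {
    41202: ["B002", "C003", "D004", "A001"],  # four ICD columns; A001 is in the last one
    25010: [1.5, None],                       # two imaging columns; None stands in for na
}


def evaluate_paired(subject):
    """Evaluate v41202 == "A001" && v25010 != na by pairing columns,
    then combine the per-pair results with a logical OR."""
    icd_columns = subject[41202]
    imaging_columns = subject[25010]
    # zip() stops at the shorter variable, so the third and fourth
    # ICD columns are never examined.
    per_pair = [
        (code == "A001") and (value is not None)
        for code, value in zip(icd_columns, imaging_columns)
    ]
    return any(per_pair)


print(evaluate_paired(subject))  # False
```

The subject has the A001 diagnosis and has imaging data, yet is not selected, because the matching ICD column has no imaging column to pair with.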
This MR proposes simplifying this logic so that all columns of a variable are always used when evaluating an expression. By default, when a binary expression combines variables with different numbers of columns, the results for each variable are collapsed with a logical OR before being combined to form the final result.
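A minimal sketch of the collapse-then-combine behaviour for the ICD example follows. As before, this is not funpack's actual implementation: the data layout and the function name evaluate_collapsed are hypothetical, None stands in for na, and only the case of variables with different numbers of columns is covered.

```python
# Hypothetical data for one subject: each variable ID maps to its column values.
subject = {
    41202: ["B002", "C003", "D004", "A001"],  # four ICD columns; A001 is in the last one
    25010: [1.5, None],                       # two imaging columns; None stands in for na
}


def evaluate_collapsed(subject):
    """Evaluate v41202 == "A001" && v25010 != na using every column of
    each variable, collapsing each side with a logical OR first."""
    # Collapse each sub-expression across all columns of its own variable.
    has_diagnosis = any(code == "A001" for code in subject[41202])
    has_imaging = any(value is not None for value in subject[25010])
    # Only then combine the two collapsed results with the logical AND.
    return has_diagnosis and has_imaging


print(evaluate_collapsed(subject))  # True
```

Because each variable is reduced to a single result per subject before the AND is applied, the number of columns on either side no longer matters.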
Unless the maintainer is being sloppy, this merge request will not be accepted unless the following criteria are met:
[ ] Unit tests pass
[ ] Changelog updated
[ ] Version number in funpack/__init__.py updated according to Semantic Versioning conventions