BF: Hierarchy encoding node IDs are not all sequential
I had mistakenly assumed that the node IDs in UKB hierarchical encodings were sequential, ranging from 1 to the number of nodes (e.g. the ICD10 coding). But this is not the case (e.g. coding 3).
This MR adjusts the Hierarchy class so that it no longer makes any assumptions about the node IDs.
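As a rough illustration of the idea (not the actual Hierarchy implementation), the sketch below keys all lookups on the node ID itself via dictionaries, rather than using the ID as a position in a list. The class name HierarchySketch, its methods, and the example node IDs and codings are all hypothetical.

```python
class HierarchySketch:
    """Hypothetical stand-in for a hierarchy keyed by arbitrary node IDs."""

    def __init__(self, nodes):
        # nodes: iterable of (node_id, parent_id, coding) tuples.
        # Using None as the parent of a root node is an assumption
        # made for this sketch only.
        self._parents = {}
        self._codings = {}
        for node_id, parent_id, coding in nodes:
            self._parents[node_id] = parent_id
            self._codings[node_id] = coding

    def parents(self, node_id):
        """Return the ancestor node IDs of node_id, nearest first."""
        result = []
        parent = self._parents[node_id]
        while parent is not None:
            result.append(parent)
            parent = self._parents[parent]
        return result

    def coding(self, node_id):
        """Return the coding label stored for node_id."""
        return self._codings[node_id]


# Node IDs with gaps, which would break a lookup that treats
# the ID as an index into a list of length len(nodes).
hier = HierarchySketch([
    (10, None, 'A'),
    (250, 10, 'A1'),
    (1070, 250, 'A1x'),
])
```

With dictionary lookups, IDs that are sparse or do not start at 1 are handled the same way as sequential ones.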
Also included in this MR:
- Built-in field/encoding tables updated to the latest versions from the showcase
- Ensure encoding values are loaded with the correct type. For example, coding 87 uses values which are composed of digits, but which are not numeric (e.g. they have leading zeros).
- New numeric/convertNumeric options to flattenHierarchical, so it can work with node IDs instead of coding labels (as the latter are not necessarily unique for non-leaf nodes)
- New documentation page for cleaning/processing functions
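To illustrate why the value type matters when loading encodings, here is a minimal sketch using only the Python standard library. The table contents, the column names, and the load_values helper are made up for this example and are not taken from coding 87 or from the actual loading code.

```python
import csv
import io

# Hypothetical tab-separated encoding table; column names and
# values are illustrative only.
raw = "coding\tmeaning\n0012\tfirst\n012\tsecond\n12\tthird\n"


def load_values(text, numeric):
    """Read the 'coding' column, optionally converting to int."""
    reader = csv.DictReader(io.StringIO(text), delimiter='\t')
    values = [row['coding'] for row in reader]
    return [int(v) for v in values] if numeric else values


# Converting to int collapses three distinct codes into one value
as_numbers = load_values(raw, numeric=True)

# Keeping the values as strings preserves the leading zeros
as_strings = load_values(raw, numeric=False)
```

In this sketch, numeric conversion is only safe when the encoding values are genuinely numeric; digit strings with leading zeros need to stay as text to remain distinguishable.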
Edited by Paul McCarthy