
BF: Hierarchy encoding node IDs are not all sequential

Paul McCarthy requested to merge bf/hierarchy into master

I had mistakenly assumed that the node IDs in UKB hierarchical encodings were sequential, ranging from 1 to the number of nodes (e.g. the ICD10 coding). But this is not the case (e.g. coding 3).

This MR adjusts the Hierarchy class so that it no longer makes any assumptions about the node IDs.
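
To illustrate the idea, here is a minimal sketch of a hierarchy that is keyed on node ID rather than indexed by position. It is not the actual Hierarchy class: the name `HierarchySketch`, its constructor arguments, and the convention that a parent ID of 0 means "no parent" are all assumptions made for this example.

```python
class HierarchySketch:
    """Hypothetical stand-in for illustration, not the real Hierarchy class."""

    def __init__(self, node_ids, parent_ids):
        # Map each node ID to its parent ID. A dict lookup makes no
        # assumption that IDs are contiguous or start at 1.
        self._parents = dict(zip(node_ids, parent_ids))

    def parents(self, node_id):
        """Return the ancestors of node_id, nearest ancestor first."""
        ancestors = []
        parent = self._parents[node_id]
        # Stop when the parent is not a known node (assumed: 0 = no parent)
        while parent in self._parents:
            ancestors.append(parent)
            parent = self._parents[parent]
        return ancestors


# Made-up, non-sequential node IDs
h = HierarchySketch([10, 25, 1070, 2000], [0, 10, 25, 10])
print(h.parents(1070))  # [25, 10]
```

Indexing a list by `node_id - 1` would fail (or silently return the wrong node) for IDs such as 1070 here, whereas the dict-based lookup does not care how the IDs are distributed.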

Also included in this MR:

  • Built-in field/encoding tables updated to latest from showcase
  • Ensure encoding values are loaded with the correct type. For example, coding 87 uses values that consist of digits but are not numeric (e.g. they have leading zeros).
  • New numeric/convertNumeric options to flattenHierarchical, so it can work with node IDs instead of coding labels (as the latter are not necessarily unique for non-leaf nodes)
  • New documentation page for cleaning/processing functions
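
The value-type issue in the list above can be sketched as follows. This is only an illustration using the standard library: the table contents are made up (they are not the real coding 87 values), and `load_values` is a hypothetical helper, not a function from this code base.

```python
import csv
import io

# Made-up encoding table: two distinct values that differ only by leading zeros
raw = "value\tmeaning\n0010\tExample A\n10\tExample B\n"


def load_values(text, numeric):
    """Read the 'value' column, optionally converting each entry to int."""
    rows = csv.DictReader(io.StringIO(text), delimiter="\t")
    convert = int if numeric else str
    return [convert(row["value"]) for row in rows]


as_int = load_values(raw, numeric=True)   # leading zeros are lost
as_str = load_values(raw, numeric=False)  # values are kept verbatim

print(as_int)  # [10, 10]
print(as_str)  # ['0010', '10']
```

Loading such values as integers collapses distinct codes into one, which is why values of this kind need to be kept as strings.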
Edited by Paul McCarthy
