Cannot assume unique names in hierarchical variables
For example, in the ICD9 classification, there are two codes called Chapter V. This causes collisions in e.g. flattenHierarchical.
added bug label
But it doesn't cause a crash. The main culprit is the ICD9 classification - there are two top-level categories called Chapter V.
If we create some dummy data (row 1 is from Chapter V Mental Disorders, and row 2 from Chapter V - Supplementary ...):
eid 41271-0.0
1 2901
2 V026
3 8000
4 5309
5 74559
and run it through funpack:
funpack -ow -cl 41271 "flattenHierarchical" out.tsv data.tsv
we get the following:
eid 41271-0.0
1 Chapter V
2 Chapter V
3 Chapter XVII
4 Chapter IX
5 Chapter XIV
Is this a bug?
Maybe just change the name in FUNPACK's copy of the ICD9 spec to Chapter V(s) or something
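To make the collision concrete, here is a minimal Python sketch. It is not FUNPACK's code: the HIERARCHY table, the node ids "ch5" and "chV", and the helpers top_level and unique_names are all hypothetical, and the "[id]" suffix is just one possible way of renaming. It only shows why flattening to a top-level name loses information when two roots share that name, and how giving each root a distinct name avoids it.

```python
from collections import Counter

# Hypothetical miniature hierarchy: node id -> (parent id, display name).
# The ids and layout are illustrative only, not FUNPACK's internal format.
HIERARCHY = {
    "ch5":  (None, "Chapter V"),   # mental disorders
    "chV":  (None, "Chapter V"),   # supplementary classification
    "2901": ("ch5", "2901"),
    "V026": ("chV", "V026"),
}


def top_level(node_id):
    """Walk parent links until a root node (no parent) is reached."""
    while HIERARCHY[node_id][0] is not None:
        node_id = HIERARCHY[node_id][0]
    return node_id


def unique_names():
    """Return root id -> name, suffixing any repeated name with its id."""
    roots = [n for n, (parent, _) in HIERARCHY.items() if parent is None]
    counts = Counter(HIERARCHY[n][1] for n in roots)
    names = {}
    for n in roots:
        name = HIERARCHY[n][1]
        # Only touch names that occur more than once among the roots.
        names[n] = f"{name} [{n}]" if counts[name] > 1 else name
    return names


if __name__ == "__main__":
    names = unique_names()
    for code in ("2901", "V026"):
        root = top_level(code)
        # Print the code, its flattened name, and a disambiguated name.
        print(code, HIERARCHY[root][1], names[root], sep="\t")
```

Flattening by name maps both codes to the same label even though they sit under different roots. Flattening by node id, or by a name that has been made unique, keeps them apart.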
mentioned in merge request !69 (merged)
closed with merge request !69 (merged)
mentioned in commit 4d95d9ee