Commit 5fb96255 authored by Paul McCarthy

ENH: Instancing code is now in variable table

parent 7b7b7a8a
@@ -32,6 +32,9 @@ variable IDs really should not conflict with actual UKB variable IDs.
class Column(object):
    """The ``Column`` is a simple container class containing metadata
    about a single column in a data file.

    See the :func:`.parseColumnName` function for important information
    about column naming conventions in the UK BioBank.
    """

    def __init__(self,
@@ -164,6 +164,7 @@ VARTABLE_COLUMNS = [
@@ -209,6 +210,7 @@ VARTABLE_DTYPES = {
    # have a data coding, and pandas uses
    # np.nan to represent missing data.
    'DataCoding' : np.float32,
    'Instancing' : np.float32,
    'NAValues'   : object,
    'RawLevels'  : object,
    'NewLevels'  : object,
@@ -615,6 +617,7 @@ def loadTableBases():
'Description' : fields['title'],
'DataCoding' : fields['encoding_id'],
'Instancing' : fields['instance_id'],
return varbase, dcbase
@@ -83,6 +83,21 @@ def parseColumnName(name):
If ``name`` does not have one of the above forms, a :exc:`ValueError` is
.. note:: For the vast majority of biobank variables, the second number in
          a column name (``visit`` above) corresponds to the assessment
          visit. However, a small number of variables (e.g. variable
          40006) are not associated with a specific visit; for these, the
          number corresponds to some other coding rather than to a visit.

          Confusingly, the UK Biobank showcase refers to the coding that a
          variable adheres to as an "instancing", whilst also using the
          term "instance" to refer to the columns of multi-valued
          variables - the ``instance`` element of the column name.

          The "instancing" that a variable uses is contained in the
          ``Instancing`` column of the variable table.
:arg name: Column name
:returns: A tuple containing:
- variable ID
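
As a rough illustration of the naming convention discussed in the note above, the sketch below splits a column name into its three numeric parts. It is not the `parseColumnName` implementation from this commit: the function name `parse_column_name_sketch` is hypothetical, and it assumes only the common UK Biobank `variable-visit.instance` form (for example `31-0.0`), whereas the real function accepts several forms that this diff does not show.

```python
import re

# Assumed form (one of possibly several): "<variable>-<visit>.<instance>",
# e.g. "31-0.0". Other forms accepted by the real parser are not handled.
_COLUMN_PATTERN = re.compile(r'(\d+)-(\d+)\.(\d+)')


def parse_column_name_sketch(name):
    """Split a column name into a (variable, visit, instance) tuple.

    Hypothetical helper for illustration only. Raises ValueError if
    ``name`` is not of the assumed ``variable-visit.instance`` form.
    """
    match = _COLUMN_PATTERN.fullmatch(name)
    if match is None:
        raise ValueError('Unrecognised column name: {}'.format(name))
    variable, visit, instance = (int(group) for group in match.groups())
    return variable, visit, instance


variable, visit, instance = parse_column_name_sketch('40006-2.0')
print(variable, visit, instance)
```

Note that the second element is only a visit index for variables that use a visit-based instancing. For a variable such as 40006, the caller would need to consult the `Instancing` column of the variable table to interpret it.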