Commit cf2c7abd authored by Paul McCarthy

Merge branch 'rf/unknowns' into 'master'

Rf/unknowns

See merge request !77
parents de0bf7ed 194ce02b
......@@ -7,8 +7,7 @@ pip install -r requirements-demo.txt
pip install -r requirements-test.txt
export PYTHONPATH=$(pwd)
python .ci/generate_notebooks.py html
python setup.py doc
mkdir public
mv index.html public
cp -r doc/html/* public/
......@@ -2,6 +2,25 @@ FUNPACK changelog
=================
2.6.0 (Monday 29th March 2021)
------------------------------
Changed
^^^^^^^
* Documentation for FUNPACK can now be found online at
https://open.win.ox.ac.uk/pages/fsl/funpack/.
* The way that "uncategorised" variables are identified has been changed.
Previously, a variable which was not in a category *and* which had no
cleaning/processing rules specified was classified as uncategorised. This
has been simplified so that a variable which is not in a category is classified
as uncategorised. This has been done to avoid the possibility of missing
newly added variables which use an existing data coding, and therefore may
implicitly have cleaning rules specified for them.
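
The difference between the two rules can be sketched as follows. This is only an illustration under stated assumptions: the helper names ``uncategorised_old`` and ``uncategorised_new``, the input shapes, and the variable ID 99999 are hypothetical and are not part of the FUNPACK API.

```python
def uncategorised_old(variables, categories, has_rules):
    """Old rule: not in any category AND no cleaning/processing rules."""
    categorised = set().union(*categories.values())
    return [v for v in variables
            if v not in categorised and v not in has_rules]


def uncategorised_new(variables, categories):
    """New rule: not in any category, regardless of cleaning rules."""
    categorised = set().union(*categories.values())
    return [v for v in variables if v not in categorised]


# Hypothetical data: 99999 stands in for a newly added variable which
# shares a data coding with an existing variable, and so implicitly has
# cleaning rules even though nobody has placed it in a category yet.
variables = [31, 20003, 99999]
categories = {'demographics': {31}, 'medication': {20003}}
has_rules = {99999}

print(uncategorised_old(variables, categories, has_rules))  # misses 99999
print(uncategorised_new(variables, categories))             # reports 99999
```

Under the old rule the new variable would go unnoticed, which is the situation the simplified rule is meant to avoid.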
2.5.2 (Monday 15th March 2021)
------------------------------
......@@ -147,9 +166,9 @@ Changed
^^^^^^^
* Modified the :func:`.binariseCategorical` function so that it parallelises
tasks internally, instead of being called in parallel for different
variables. This should give superior performance (!60).
* Modified the :func:`.processing_functions.binariseCategorical` function so
that it parallelises tasks internally, instead of being called in parallel
for different variables. This should give superior performance (!60).
* Revisited the :meth:`.DataTable.merge` method to optimise performance in all
scenarios (!60).
* Improved performance of the :mod:`.fmrib` date/time normalisation routines,
......@@ -179,13 +198,16 @@ Changed
^^^^^^^
* Substantial performance improvements to the :func:`.codeToNumeric` cleaning
function, and to :func:`.removeIfRedundant`, :func:`.binariseCategorical`,
and other processing functions.
* The default implementation of :func:`.removeIfRedundant` now uses matrix
algebra rather than pairwise comparisons. This requires more memory, but
is much faster.
* Added [`threadpoolctl`](https://github.com/joblib/threadpoolctl/) as a
* Substantial performance improvements to the
:func:`.cleaning_functions.codeToNumeric` cleaning function, and to
:func:`.processing_functions.removeIfRedundant`,
:func:`.processing_functions.binariseCategorical`, and other processing
functions.
* The default implementation of
:func:`.processing_functions.removeIfRedundant` now uses matrix algebra
rather than pairwise comparisons. This requires more memory, but is much
faster.
* Added `threadpoolctl <https://github.com/joblib/threadpoolctl/>`_ as a
dependency, for setting the number of threads to use when parallelising
``numpy`` operations.
......@@ -320,8 +342,8 @@ Changed
* The ``--config_file`` option can be used more than once, and can also be
used from within a configuration file (i.e. one configuration file may
"include" another).
* Changed the way that the :func:`.removeIfRedundant` process splits up
the data set for parallel processing.
* Changed the way that the :func:`.processing_functions.removeIfRedundant`
process splits up the data set for parallel processing.
1.8.1 (Wednesday 19th February 2020)
......@@ -332,7 +354,8 @@ Added
^^^^^
* New ``naval`` option to the :func:`.removeIfSparse` processing function.
* New ``naval`` option to the :func:`.processing_functions.removeIfSparse`
processing function.
Changed
......@@ -351,12 +374,13 @@ Added
^^^^^
* New ``take`` option to the :func:`.binariseCategorical` processing function,
which allows the generated columns to contain values from another column,
instead of containing binary labels.
* New ``fillval`` option to the :func:`.binariseCategorical` processing
function, which can be used in conjunction with ``take``, to specify the
fill value for missing rows.
* New ``take`` option to the :func:`.processing_functions.binariseCategorical`
processing function, which allows the generated columns to contain values
from another column, instead of containing binary labels.
* New ``fillval`` option to the
:func:`.processing_functions.binariseCategorical` processing function, which
can be used in conjunction with ``take``, to specify the fill value for
missing rows.
* Argument **broadcasting** for processing functions - when a process is
applied independently to more than one variable, the input arguments to the
process may need to be different for each variable. This can be accomplished
......@@ -460,7 +484,7 @@ Added
* Non-numeric variables can now be used in conditional expressions, e.g.
``'v41202 == "A009"'`. Within such expressions, the value must be contained
``'v41202 == "A009"'``. Within such expressions, the value must be contained
within single or double quotes.
* New ``contains`` operator, for use within conditional expressions to test
presence of sub-strings.
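
The quoting rule and the ``contains`` operator can be illustrated in plain Python. This is a rough sketch, not FUNPACK's expression parser: the ``evaluate`` helper and its regular expression are hypothetical, and it assumes that ``contains`` behaves like a plain sub-string test.

```python
import re

# Hypothetical mini-grammar:  v<ID> <op> "<value>"
# where <op> is == or contains, and the value is wrapped in matching
# single or double quotes.
PATTERN = re.compile(r'''^v(\d+)\s+(==|contains)\s+(["'])(.*)\3$''')


def evaluate(expression, row):
    """Evaluate one expression against a {variable_id: string} row."""
    match = PATTERN.match(expression.strip())
    if match is None:
        raise ValueError(f'unsupported expression: {expression}')
    variable, operator, _, value = match.groups()
    cell = row[int(variable)]
    if operator == '==':
        return cell == value
    return value in cell  # assumed meaning of "contains"


row = {41202: 'A009'}
print(evaluate('v41202 == "A009"', row))
print(evaluate("v41202 contains 'A00'", row))
print(evaluate('v41202 contains "B"', row))
```

An unquoted value such as ``v41202 == A009`` is rejected by this sketch, mirroring the requirement that non-numeric values be quoted.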
......@@ -554,9 +578,9 @@ Fixed
^^^^^
* Fixed a bug where non-numeric variables (e.g. `41271
<https://biobank.ctsu.ox.ac.uk/crystal/field.cgi?id=41271>`_ ) were being
interpreted by ``pandas`` as being numeric.
* Fixed a bug where non-numeric variables (e.g.
`41271 <https://biobank.ctsu.ox.ac.uk/crystal/field.cgi?id=41271>`_ ) were
being interpreted by ``pandas`` as being numeric.
1.4.4 (Friday 15th November 2019)
......@@ -620,12 +644,12 @@ Added
to specify the type to use internally for a given variable
(e.g. ``float64``). This is so that the default type of ``float32`` can be
overridden for specific variables for which this is problematic, such as
variable :ref:`20003
<https://biobank.ctsu.ox.ac.uk/crystal/field.cgi?id=20003>`_. This column is
initially populated from ``funpack/data/type.txt``.
variable
`20003 <https://biobank.ctsu.ox.ac.uk/crystal/field.cgi?id=20003>`_.
This column is initially populated from ``funpack/data/type.txt``.
* New :mod:`funpack.coding` module, for retrieving descriptive information
about data codings. The information is stored in the
``funpack/data/coding/``directory. Hierarchical data codings are still
``funpack/data/coding/`` directory. Hierarchical data codings are still
accessed via the :mod:`.hierarchy` module.
* New :func:`hierarchicalDescriptionFromCode`,
:func:`hierarchicalDescriptionFromNumeric`, and
......@@ -637,8 +661,8 @@ Changed
^^^^^^^
* The hierarchical coding name no longer needs to be specified when using the
:func:`.codeToNumeric` cleaning function - the coding is automatically looked
up.
:func:`.cleaning_functions.codeToNumeric` cleaning function - the coding is
automatically looked up.
* Variable 4288 has been moved from ``cognitive phenotypes`` to
``miscellaneous`` in the FMRIB categories.
* Variable 20003 is now binarised in the FMRIB categories.
......@@ -692,8 +716,8 @@ Added
^^^^^
* New :func:`.codeToNumeric` cleaning function, for transforming hierarchical
variable codes.
* New :func:`.cleaning_functions.codeToNumeric` cleaning function, for
transforming hierarchical variable codes.
* New :func:`.hierarchy.codeToNumeric` and
:func:`.hierarchy.numericToCode` functions.
* New meta-process functions for generating descriptions for ICD9, OPCS3 and
......@@ -729,9 +753,9 @@ Deprecated
^^^^^^^^^^
* The :func:`.convertICD10Codes` cleaning function has been replaced by
the new :func:`.codeToNumeric` function, which can be used with any
hierarchical variable.
* The :func:`.convertICD10Codes` cleaning function has been replaced by the
new :func:`.cleaning_functions.codeToNumeric` function, which can be used
with any hierarchical variable.
* The :func:`.icd10.codeToNumeric` and :func:`.icd10.numericToCode` functions
have been replaced by the :func:`.hierarchy.codeToNumeric` and
:func:`.hierarchy.numericToCode` functions.
......@@ -928,8 +952,8 @@ Changed
^^^^^^^
* The :func:`.binariseCategorical` function sets the categorical value as
column metadata on the new binarised columns.
* The :func:`.processing_functions.binariseCategorical` function sets the
categorical value as column metadata on the new binarised columns.
0.20.1 (Wednesday 8th May 2019)
......@@ -1169,8 +1193,8 @@ Fixed
^^^^^
* Fixed an issue with the :func:`.binariseCategorical` processing function
being applied to ICD10 codes.
* Fixed an issue with the :func:`.processing_functions.binariseCategorical`
processing function being applied to ICD10 codes.
0.14.7 (Sunday 17th March 2019)
......@@ -1752,8 +1776,8 @@ Fixed
^^^^^
* The :func:`.binariseCategorical` function now works on data with missing
values.
* The :func:`.processing_functions.binariseCategorical` function now works on
data with missing values.
0.3.0
......@@ -1766,11 +1790,12 @@ Added
* New :meth:`.DataTable.addColumns` method, so processing functions can
now add new columns.
* New :func:`.binariseCategorical` processing function, which expands a
categorical column into multiple binary columns, one for each unique
value in the data.
* New :func:`.expandCompound` processing function, which expands a
compound column into columns, one for each value in the compound data.
* New :func:`.processing_functions.binariseCategorical` processing function,
which expands a categorical column into multiple binary columns, one for
each unique value in the data.
* New :func:`.processing_functions.expandCompound` processing function, which
expands a compound column into columns, one for each value in the compound
data.
* Keyword arguments can now be used when specifying processing.
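
The idea behind binarising a categorical column can be sketched in plain Python. This is an illustration only: the ``binarise`` helper is hypothetical, it does not reproduce the signature or options of the FUNPACK function, and it assumes that a missing row (``None`` here) is encoded as 0 in every generated column.

```python
def binarise(values):
    """Expand a categorical column into one binary column per unique value.

    Returns a dict mapping each unique value to a list of 0/1 flags,
    one flag per input row. Missing rows (None) produce no column of
    their own and are flagged 0 everywhere (an assumption of this sketch).
    """
    uniques = sorted({v for v in values if v is not None})
    return {u: [int(v == u) for v in values] for u in uniques}


column = ['A', 'B', None, 'A']
for value, binary in binarise(column).items():
    print(value, binary)
```

Each unique value in the input becomes its own column, so a column with ``n`` distinct values expands into ``n`` binary columns.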
......
......@@ -43,13 +43,18 @@ Or from ``conda-forge``::
conda install -c conda-forge fmrib-unpack
The FUNPACK source code can be found at
https://git.fmrib.ox.ac.uk/fsl/funpack/.
Introductory notebook
---------------------
The ``funpack_demo`` command will start a Jupyter Notebook which introduces
the main features provided by FUNPACK. A non-interactive version of this
notebook can be found at https://open.win.ox.ac.uk/pages/fsl/funpack/.
notebook can be found at
https://open.win.ox.ac.uk/pages/fsl/funpack/demo.html.
If you are using ``pip``, you need to install a few additional dependencies::
......
.. include:: ../CHANGELOG.rst
......@@ -41,6 +41,7 @@ extensions = [
'sphinx.ext.autodoc',
'sphinx.ext.viewcode',
'sphinx.ext.autosummary',
'nbsphinx'
]
# Add any paths that contain templates here, relative to this directory.
......@@ -54,7 +55,7 @@ exclude_patterns = []
# -- Options for HTML output -------------------------------------------------
# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#
html_theme = 'sphinx_rtd_theme'
......
funpack.cleaning module
=======================
``funpack.cleaning``
====================
.. automodule:: funpack.cleaning
:members:
......
funpack.cleaning\_functions module
==================================
``funpack.cleaning_functions``
==============================
.. automodule:: funpack.cleaning_functions
:members:
......
funpack.coding module
=====================
``funpack.coding``
==================
.. automodule:: funpack.coding
:members:
......
funpack.config module
=====================
``funpack.config``
==================
.. automodule:: funpack.config
:members:
......
funpack.custom module
=====================
``funpack.custom``
==================
.. automodule:: funpack.custom
:members:
......
funpack.datatable module
========================
``funpack.datatable``
=====================
.. automodule:: funpack.datatable
:members:
......
funpack.dryrun module
=====================
``funpack.dryrun``
==================
.. automodule:: funpack.dryrun
:members:
......
funpack.exporting module
========================
``funpack.exporting``
=====================
.. automodule:: funpack.exporting
:members:
......
funpack.exporting\_hdf5 module
==============================
``funpack.exporting_hdf5``
==========================
.. automodule:: funpack.exporting_hdf5
:members:
......
funpack.exporting\_tsv module
=============================
``funpack.exporting_tsv``
=========================
.. automodule:: funpack.exporting_tsv
:members:
......
funpack.expression module
=========================
``funpack.expression``
======================
.. automodule:: funpack.expression
:members:
......
funpack.fileinfo module
=======================
``funpack.fileinfo``
====================
.. automodule:: funpack.fileinfo
:members:
......
funpack.hierarchy module
========================
``funpack.hierarchy``
=====================
.. automodule:: funpack.hierarchy
:members:
......
funpack.icd10 module
====================
``funpack.icd10``
=================
.. automodule:: funpack.icd10
:members:
......