Commit e0f968bf authored by Paul McCarthy

DOC: Tweaks to readme/homepage, fixes to changelog

parent b19809fa
......@@ -164,9 +164,9 @@ Changed
^^^^^^^
* Modified the :func:`.binariseCategorical` function so that it parallelises
tasks internally, instead of being called in parallel for different
variables. This should give superior performance (!60).
* Modified the :func:`.processing_functions.binariseCategorical` function so
that it parallelises tasks internally, instead of being called in parallel
for different variables. This should give superior performance (!60).
* Revisited the :meth:`.DataTable.merge` to optimise performance in all
scenarios (!60).
* Improved performance of the :mod:`.fmrib` date/time normalisation routines,
......@@ -196,13 +196,16 @@ Changed
^^^^^^^
* Substantial performance improvements to the :func:`.codeToNumeric` cleaning
function, and to :func:`.removeIfRedundant`, :func:`.binariseCategorical`,
and other processing functions.
* The default implementation of :func:`.removeIfRedundant` now uses matrix
algebra rather than pairwise comparisons. This requires more memory, but
is much faster.
* Added [`threadpoolctl`](https://github.com/joblib/threadpoolctl/) as a
* Substantial performance improvements to the
:func:`.cleaning_functions.codeToNumeric` cleaning function, and to
:func:`.processing_functions.removeIfRedundant`,
:func:`.processing_functions.binariseCategorical`, and other processing
functions.
* The default implementation of
:func:`.processing_functions.removeIfRedundant` now uses matrix algebra
rather than pairwise comparisons. This requires more memory, but is much
faster.
* Added `threadpoolctl <https://github.com/joblib/threadpoolctl/>`_ as a
dependency, for setting the number of threads to use when parallelising
``numpy`` operations.
......@@ -337,8 +340,8 @@ Changed
* The ``--config_file`` option can be used more than once, and can also be
used from within a configuration file (i.e. one configuration file may
"include" another).
* Changed the way that the :func:`.removeIfRedundant` process splits up
the data set for parallel processing.
* Changed the way that the :func:`.processing_functions.removeIfRedundant`
process splits up the data set for parallel processing.
1.8.1 (Wednesday 19th February 2020)
......@@ -349,7 +352,8 @@ Added
^^^^^
* New ``naval`` option to the :func:`.removeIfSparse` processing function.
* New ``naval`` option to the :func:`.processing_functions.removeIfSparse`
processing function.
Changed
......@@ -368,12 +372,13 @@ Added
^^^^^
* New ``take`` option to the :func:`.binariseCategorical` processing function,
which allows the generated columns to contain values from another column,
instead of containing binary labels.
* New ``fillval`` option to the :func:`.binariseCategorical` processing
function, which can be used in conjunction with ``take``, to specify the
fill value for missing rows.
* New ``take`` option to the :func:`.processing_functions.binariseCategorical`
processing function, which allows the generated columns to contain values
from another column, instead of containing binary labels.
* New ``fillval`` option to the
:func:`.processing_functions.binariseCategorical` processing function, which
can be used in conjunction with ``take``, to specify the fill value for
missing rows.
* Argument **broadcasting** for processing functions - when a process is
applied independently to more than one variable, the input arguments to the
process may need to be different for each variable. This can be accomplished
......@@ -477,7 +482,7 @@ Added
* Non-numeric variables can now be used in conditional expressions, e.g.
``'v41202 == "A009"'`. Within such expressions, the value must be contained
``'v41202 == "A009"'``. Within such expressions, the value must be contained
within single or double quotes.
* New ``contains`` operator, for use within conditional expressions to test
presence of sub-strings.
......@@ -571,9 +576,9 @@ Fixed
^^^^^
* Fixed a bug where non-numeric variables (e.g. `41271
<https://biobank.ctsu.ox.ac.uk/crystal/field.cgi?id=41271>`_ ) were being
interpreted by ``pandas`` as being numeric.
* Fixed a bug where non-numeric variables (e.g.
`41271 <https://biobank.ctsu.ox.ac.uk/crystal/field.cgi?id=41271>`_ ) were
being interpreted by ``pandas`` as being numeric.
1.4.4 (Friday 15th November 2019)
......@@ -637,12 +642,12 @@ Added
to specify the type to use internally for a given variable
(e.g. ``float64``). This is so that the default type of ``float32`` can be
overridden for specific variables for which this is problematic, such as
variable :ref:`20003
<https://biobank.ctsu.ox.ac.uk/crystal/field.cgi?id=20003>`_. This column is
initially populated from ``funpack/data/type.txt``.
variable
`20003 <https://biobank.ctsu.ox.ac.uk/crystal/field.cgi?id=20003>`_.
This column is initially populated from ``funpack/data/type.txt``.
* New :mod:`funpack.coding` module, for retrieving descriptive information
about data codings. The information is stored in the
``funpack/data/coding/``directory. Hierarchical data codings are still
``funpack/data/coding/`` directory. Hierarchical data codings are still
accessed via the :mod:`.hierarchy` module.
* New :func:`hierarchicalDescriptionFromCode`,
:func:`hierarchicalDescriptionFromNumeric`, and
......@@ -654,8 +659,8 @@ Changed
^^^^^^^
* The hierarchical coding name no longer needs to be specified when using the
:func:`.codeToNumeric` cleaning function - the coding is automatically looked
up.
:func:`.cleaning_functions.codeToNumeric` cleaning function - the coding is
automatically looked up.
* Variable 4288 has been moved from ``cognitive phenotypes`` to
``miscellaneous`` in the FMRIB categories.
* Variable 20003 is now binarised in the FMRIB categories.
......@@ -709,8 +714,8 @@ Added
^^^^^
* New :func:`.codeToNumeric` cleaning function, for transforming hierarchical
variable codes.
* New :func:`.cleaning_functions.codeToNumeric` cleaning function, for
transforming hierarchical variable codes.
* New :func:`.hierarchy.codeToNumeric` and
:func:`.hierarchy.numericToCode` functions.
* New meta-process functions for generating descriptions for ICD9, OPCS3 and
......@@ -746,9 +751,9 @@ Deprecated
^^^^^^^^^^
* The :func:`.convertICD10Codes` cleaning function has been replaced by
the new :func:`.codeToNumeric` function, which can be used with any
hierarchical variable.
* The :func:`.convertICD10Codes` cleaning function has been replaced by the
new :func:`.cleaning_functions.codeToNumeric` function, which can be used
with any hierarchical variable.
* The :func:`.icd10.codeToNumeric` and :func:`.icd10.numericToCode` functions
have been replaced by the :func:`.hierarchy.codeToNumeric` and
:func:`.hierarchy.numericToCode` functions.
......@@ -945,8 +950,8 @@ Changed
^^^^^^^
* The :func:`.binariseCategorical` function sets the categorical value as
column metadata on the new binarised columns.
* The :func:`.processing_functions.binariseCategorical` function sets the
categorical value as column metadata on the new binarised columns.
0.20.1 (Wednesday 8th May 2019)
......@@ -1186,8 +1191,8 @@ Fixed
^^^^^
* Fixed an issue with the :func:`.binariseCategorical` processing function
being applied to ICD10 codes.
* Fixed an issue with the :func:`.processing_functions.binariseCategorical`
processing function being applied to ICD10 codes.
0.14.7 (Sunday 17th March 2019)
......@@ -1769,8 +1774,8 @@ Fixed
^^^^^
* The :func:`.binariseCategorical` function now works on data with missing
values.
* The :func:`.processing_functions.binariseCategorical` function now works on
data with missing values.
0.3.0
......@@ -1783,11 +1788,12 @@ Added
* New :meth:`.DataTable.addColumns` method, so processing functions can
now add new columns.
* New :func:`.binariseCategorical` processing function, which expands a
categorical column into multiple binary columns, one for each unique
value in the data.
* New :func:`.expandCompound` processing function, which expands a
compound column into columns, one for each value in the compound data.
* New :func:`.processing_functions.binariseCategorical` processing function,
which expands a categorical column into multiple binary columns, one for
each unique value in the data.
* New :func:`.processing_functions.expandCompound` processing function, which
expands a compound column into columns, one for each value in the compound
data.
* Keyword arguments can now be used when specifying processing.
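
The binarisation described in the entry above (one binary column per unique value in a categorical column) can be illustrated with a minimal pure-Python sketch. This is not FUNPACK's implementation: the function name ``binarise_categorical``, its signature, and the use of ``None`` for missing values are assumptions made purely for illustration.

```python
def binarise_categorical(values):
    """Expand a categorical column into one 0/1 column per unique value.

    Hypothetical helper for illustration only; it is not part of FUNPACK.
    Missing values are assumed to be represented as None.
    """
    # Collect the unique non-missing values, sorted so the output is stable.
    categories = sorted({v for v in values if v is not None})
    # One output column per category: 1 where the row matches, otherwise 0.
    return {cat: [1 if v == cat else 0 for v in values] for cat in categories}


column = ["A", "B", None, "A"]
binary = binarise_categorical(column)
```

Here a row with a missing value simply receives 0 in every generated column; a real implementation may choose to propagate the missing value instead.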
......
......@@ -49,7 +49,8 @@ Introductory notebook
The ``funpack_demo`` command will start a Jupyter Notebook which introduces
the main features provided by FUNPACK. A non-interactive version of this
notebook can be found at https://open.win.ox.ac.uk/pages/fsl/funpack/.
notebook can be found at
https://open.win.ox.ac.uk/pages/fsl/funpack/demo.html.
If you are using ``pip``, you need to install a few additional dependencies::
......
......@@ -3,10 +3,11 @@ FUNPACK
.. toctree::
:hidden:
:maxdepth: 0
self
changelog
demo
demo.ipynb
funpack.cleaning
funpack.cleaning_functions
funpack.coding
......
funpack
=======
.. toctree::
:maxdepth: 4
funpack
doc/win.png

27.3 KB
