UKB_UNCONFOUND_v2 issues
https://git.fmrib.ox.ac.uk/falmagro/ukb_unconfound_v2/-/issues

gen_STRUCTMOTION/script_predict.py fails
https://git.fmrib.ox.ac.uk/falmagro/ukb_unconfound_v2/-/issues/1
Tom Nichols, 2021-02-22T10:44:13+00:00

With some success, I've been using the `UKB_UNCONFOUND_v2` pipeline, but instead of installing the crucial `bb_pipeline_v_2.0` myself, I have been using the installed version on Rescomp.
This worked fine until I ran `gen_STRUCTMOTION/script_predict.py`, which failed with the error:
```
Traceback (most recent call last):
File "./script_predict.py", line 35, in <module>
final_model = pickle.load(open('MODEL/model.p','rb'))
File "/gpfs2/well/win/projects/ukbiobank/fbp/bb_pipeline_v_2.0/bb_python/bb_python/lib/python3.5/site-packages/sklearn/grid_search.py", line 24, in <module>
from .cross_validation import check_cv
File "/gpfs2/well/win/projects/ukbiobank/fbp/bb_pipeline_v_2.0/bb_python/bb_python/lib/python3.5/site-packages/sklearn/cross_validation.py", line 32, in <module>
from .metrics.scorer import check_scoring
File "/gpfs2/well/win/projects/ukbiobank/fbp/bb_pipeline_v_2.0/bb_python/bb_python/lib/python3.5/site-packages/sklearn/metrics/__init__.py", line 7, in <module>
from .ranking import auc
File "/gpfs2/well/win/projects/ukbiobank/fbp/bb_pipeline_v_2.0/bb_python/bb_python/lib/python3.5/site-packages/sklearn/metrics/ranking.py", line 32, in <module>
from ..utils.stats import rankdata
File "/gpfs2/well/win/projects/ukbiobank/fbp/bb_pipeline_v_2.0/bb_python/bb_python/lib/python3.5/site-packages/sklearn/utils/stats.py", line 2, in <module>
from scipy.stats import rankdata as _sp_rankdata
File "/gpfs2/well/win/projects/ukbiobank/fbp/bb_pipeline_v_2.0/bb_python/bb_python/lib/python3.5/site-packages/scipy/stats/__init__.py", line 344, in <module>
from .stats import *
File "/gpfs2/well/win/projects/ukbiobank/fbp/bb_pipeline_v_2.0/bb_python/bb_python/lib/python3.5/site-packages/scipy/stats/stats.py", line 176, in <module>
from . import distributions, mstats_basic, _stats
File "/gpfs2/well/win/projects/ukbiobank/fbp/bb_pipeline_v_2.0/bb_python/bb_python/lib/python3.5/site-packages/scipy/stats/distributions.py", line 13, in <module>
from . import _continuous_distns
File "/gpfs2/well/win/projects/ukbiobank/fbp/bb_pipeline_v_2.0/bb_python/bb_python/lib/python3.5/site-packages/scipy/stats/_continuous_distns.py", line 17, in <module>
from scipy._lib._numpy_compat import broadcast_to
File "/gpfs2/well/win/projects/ukbiobank/fbp/bb_pipeline_v_2.0/bb_python/bb_python/lib/python3.5/site-packages/scipy/_lib/_numpy_compat.py", line 10, in <module>
from numpy.testing.nosetester import import_nose
ImportError: No module named 'numpy.testing.nosetester'
```
This seems to be the same error reported in this [Stack Overflow post](https://stackoverflow.com/questions/59474533/modulenotfounderror-no-module-named-numpy-testing-nosetester), which attributes it to a 3-way conflict between particular versions of `numpy`, `scipy` and `sklearn`.
Notably, the versions specified in `bb_pipeline_v_2.0/initvars` differ between the repo and Rescomp. On the [`bb_pipeline_v_2.0`](https://git.fmrib.ox.ac.uk/falmagro/UK_biobank_pipeline_v_1) repo, [bb_python/python_installation/install_bb_python.sh](https://git.fmrib.ox.ac.uk/falmagro/UK_biobank_pipeline_v_1/-/blob/master/bb_python/python_installation/install_bb_python.sh) installs only 10 packages, while the same `install_bb_python.sh` file on Rescomp installs 62 packages. While *both* pin `numpy`, `scipy` and `sklearn` to versions that shouldn't create this problem, subsequent installs must be pulling in a newer `numpy`:
```
numpy repo pin: 1.11.1
numpy repo version after all installs: 1.12.0
numpy Rescomp pin: 1.12.0
numpy Rescomp version after all installs: 1.18.1
```
For scipy and scikit-learn, the repo and Rescomp versions are the same, and the pinned and final installed versions are also the same (0.18.0 and 0.17.1, respectively).
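To confirm which versions are actually in effect after all installs, it may help to ask the active environment directly instead of reading the install script. Below is a minimal sketch; the helper name `installed_versions` is my own and not part of the pipeline, and it avoids f-strings so that it should also run on the Python 3.5 used by `bb_python`.

```python
import importlib


def installed_versions(packages=("numpy", "scipy", "sklearn")):
    """Return {package name: version string, or the import failure message}."""
    versions = {}
    for name in packages:
        try:
            module = importlib.import_module(name)
            versions[name] = getattr(module, "__version__", "unknown")
        except Exception as exc:
            # A failure at import time is itself diagnostic, so record it
            # instead of stopping at the first broken package.
            versions[name] = "import failed: {}".format(exc)
    return versions


if __name__ == "__main__":
    for name, version in sorted(installed_versions().items()):
        print("{:8s} {}".format(name, version))
```

Running this once inside the Rescomp `bb_python` and once inside a locally built one should show whether the final versions really differ from the pins.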
*Once* I switched to using my own `bb_python` (with only the 10 packages), I no longer got that error. Instead I get another error :( one that I have less ability to diagnose:
```
(bb_python) [kfh142@rescomp1 gen_STRUCTMOTION]$ ./script_predict.py
Traceback (most recent call last):
File "./script_predict.py", line 36, in <module>
prediction = final_model.predict(test_data_norm)
File "/gpfs2/well/nichols/shared/UK_biobank_pipeline_v_1/bb_python/bb_python/lib/python3.5/site-packages/sklearn/utils/metaestimators.py", line 37, in <lambda>
out = lambda *args, **kwargs: self.fn(obj, *args, **kwargs)
File "/gpfs2/well/nichols/shared/UK_biobank_pipeline_v_1/bb_python/bb_python/lib/python3.5/site-packages/sklearn/grid_search.py", line 435, in predict
return self.best_estimator_.predict(X)
File "/gpfs2/well/nichols/shared/UK_biobank_pipeline_v_1/bb_python/bb_python/lib/python3.5/site-packages/sklearn/linear_model/base.py", line 200, in predict
return self._decision_function(X)
File "/gpfs2/well/nichols/shared/UK_biobank_pipeline_v_1/bb_python/bb_python/lib/python3.5/site-packages/sklearn/linear_model/base.py", line 185, in _decision_function
dense_output=True) + self.intercept_
File "/gpfs2/well/nichols/shared/UK_biobank_pipeline_v_1/bb_python/bb_python/lib/python3.5/site-packages/sklearn/utils/extmath.py", line 184, in safe_sparse_dot
return fast_dot(a, b)
ValueError: shapes (45480,9) and (11,) not aligned: 9 (dim 1) != 11 (dim 0)
```
which I assume is linked to `PREDICT_DATA` having the wrong number of columns: the data has 9 columns where the model expects 11. I suspect there's an SGE error lurking here that I'm still tracking down.
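A quick way to check the column-count theory before calling `predict` is to compare the data's second dimension against the length of the model's coefficient vector. This is only a sketch: `check_feature_count` is a hypothetical helper, and the zero arrays below merely mimic the column counts from the traceback. In `script_predict.py` the real inputs would presumably be `test_data_norm` and `final_model.best_estimator_.coef_`, assuming the fitted linear model exposes `coef_`.

```python
import numpy as np


def check_feature_count(data, coef):
    """Return (columns in data, features the model expects, whether they match)."""
    data = np.asarray(data)
    coef = np.asarray(coef)
    n_data = data.shape[1]    # columns in the prediction matrix
    n_model = coef.shape[-1]  # one coefficient per expected feature
    return n_data, n_model, n_data == n_model


if __name__ == "__main__":
    # Synthetic stand-ins with the column counts reported in the traceback.
    fake_data = np.zeros((4, 9))
    fake_coef = np.zeros(11)
    print(check_feature_count(fake_data, fake_coef))  # (9, 11, False)
```

If the counts disagree on the real data, the next step would be to see which two columns are missing from `PREDICT_DATA`.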