FSL / funpack
Commit 03033acb authored Dec 17, 2018 by Paul McCarthy
BF: Was testing column type frequency the wrong way around
parent 99f38091
ukbparse/fileinfo.py
@@ -113,9 +113,9 @@ def has_header(sample,
     # if more than two thirds of rows
     # have a different type to the first
     # row, let's say we have a header.
-    threshold = collections.defaultdict(lambda: 0.66)
+    threshold = collections.defaultdict(lambda: 0.34)
     threshold[1] = 1.0
-    threshold[2] = 0.49
+    threshold[2] = 0.51
     for col, ctypes in coltypes.items():
@@ -123,8 +123,8 @@ def has_header(sample,
         hist = collections.Counter(ctypes)
         thres = threshold[len(ctypes)]
-        if hist[t0] / len(ctypes) > thres: colcount += 1
-        else:                              colcount -= 1
+        if (hist[t0] / len(ctypes)) < thres: colcount += 1
+        else:                                colcount -= 1
     return colcount > 0
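
A minimal, standalone sketch of the corrected vote may help when reviewing this change. It is not the actual `has_header` from `ukbparse/fileinfo.py`: the function name `votes_for_header` is hypothetical, and it assumes that `coltypes` maps each column to the list of value types seen in that column (first row included) and that `t0` is the type of the first row's value. Neither is shown in the diff. The thresholds and the `<` comparison are taken from the new side of the diff.

```python
import collections


def votes_for_header(coltypes):
    """Return True if the first row looks like a header (hypothetical helper).

    ``coltypes`` is assumed to map each column to the list of value
    types found in that column, with the first row's type at index 0.
    """
    # Fraction of rows sharing the first row's type below which a
    # column votes "header". Values are those introduced by the commit.
    threshold = collections.defaultdict(lambda: 0.34)
    threshold[1] = 1.0
    threshold[2] = 0.51

    colcount = 0
    for col, ctypes in coltypes.items():
        if len(ctypes) == 0:
            continue  # nothing to vote on (guard added for this sketch)

        t0 = ctypes[0]  # assumption: type of the first row's value
        hist = collections.Counter(ctypes)
        thres = threshold[len(ctypes)]

        # Few rows share the first row's type, so the first row is
        # probably a header: the column votes for a header.
        if (hist[t0] / len(ctypes)) < thres:
            colcount += 1
        else:
            colcount -= 1

    return colcount > 0


# A string first row above numeric data votes for a header;
# a column whose rows all share one type votes against.
with_header = {"a": [str, int, int, int], "b": [str, float, float, float]}
without_header = {"a": [int, int, int, int], "b": [str, str, str, str]}
```

With the old `>` comparison, a column in which every row had the same type as the first row would have voted for a header, which is the reverse of what the comment in the first hunk describes.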