Commit 5db9fb32 authored by Ying-Qiu Zheng

Update 2021JUL21.md

parent 7f969360
@@ -18,12 +18,14 @@ The marginal distribution of $`\mathbf{x}_{n}^{L}, \mathbf{x}_{n}^{H}`$ is
In summary, in addition to finding the hyper-parameters $`\pi, \mu, \Sigma_{k}^{H}, \Sigma^{L}_{k}`$, we want to estimate a transformation matrix $`\mathbf{U}`$ such that $`\mathbf{UX}^{H}`$ is as close to $`\mathbf{X}^{L}`$ as possible (or vice versa).
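To make "as close as possible" concrete, here is a minimal sketch that fits $`\mathbf{U}`$ by ordinary least squares, i.e. by minimising the Frobenius norm of $`\mathbf{UX}^{H} - \mathbf{X}^{L}`$. This is only an illustrative baseline under that assumed loss, not the estimation procedure of the mixture model above. The function name `fit_linear_map` and the toy data are hypothetical.

```julia
using LinearAlgebra, Random

# Least-squares estimate of U in U * XH ≈ XL.
# XH is dH × n and XL is dL × n, with one sample per column.
# Right division returns the least-squares solution for rectangular inputs.
fit_linear_map(XH::AbstractMatrix, XL::AbstractMatrix) = XL / XH

# Toy check (assumed data): recover a known map from slightly noisy observations
Random.seed!(1)
U_true = randn(3, 3)
XH = randn(3, 200)
XL = U_true * XH .+ 0.01 .* randn(3, 200)

U_hat = fit_linear_map(XH, XL)
rel_err = norm(U_hat - U_true) / norm(U_true)
```

For the reverse direction ("or vice versa"), swapping the two arguments gives a map from the low-quality to the high-quality features.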
### Simulation results
#### We considered three scenarios
##### I. Low-quality data noisier than the high-quality data
We simulate the case where the features of the low-quality data are noisier than those of the high-quality data. The number of informative features remains the same, however.
```julia
noise_level = 10
d = 3
# the high- and low-quality data share the same cluster centroids
# there are two clusters, and d features
XHmean = hcat(randn(Float32, d), randn(Float32, d))
XLmean = copy(XHmean)
n_samples = 1000
@@ -37,11 +39,11 @@ for c ∈ [1, 2]
    XLtrain[:, findall(x -> x==c, class)] .= rand(MvNormal(XLmean[:, c], 0.05f0 * noise_level * I), count(x -> x==c, class))
end
```
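Because the diff elides the lines between `n_samples = 1000` and the loop body, the following is a self-contained sketch of scenario I under stated assumptions. The uniform random `class` labels, the zero-initialised `XHtrain`/`XLtrain` arrays, and the high-quality variance of `0.05` are guesses, not taken from the source. The hypothetical helper `sample_isotropic` draws from a Gaussian with a scaled-identity covariance, which is what `MvNormal(μ, σ² * I)` describes, so the sketch needs only the standard library.

```julia
using Random, Statistics

# Draw n columns from an isotropic Gaussian N(μ, σ² I).
sample_isotropic(μ::AbstractVector{T}, σ²::Real, n::Integer) where {T} =
    μ .+ sqrt(T(σ²)) .* randn(T, length(μ), n)

Random.seed!(1)
d, n_samples, noise_level = 3, 1000, 10
XHmean = hcat(randn(Float32, d), randn(Float32, d))  # shared centroids, one per column
XLmean = copy(XHmean)

class = rand(1:2, n_samples)              # assumed: uniform random cluster labels
XHtrain = zeros(Float32, d, n_samples)
XLtrain = zeros(Float32, d, n_samples)
for c in 1:2
    idx = findall(==(c), class)
    # assumed: high-quality variance 0.05, low-quality variance 0.05 * noise_level
    XHtrain[:, idx] .= sample_isotropic(XHmean[:, c], 0.05f0, length(idx))
    XLtrain[:, idx] .= sample_isotropic(XLmean[:, c], 0.05f0 * noise_level, length(idx))
end

# Residual variances around the centroids; the low-quality one should be
# roughly noise_level times the high-quality one.
resH = var(XHtrain .- XHmean[:, class])
resL = var(XLtrain .- XLmean[:, class])
```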
##### II. Low-quality data noisier than the high-quality data, with fewer informative features
In this scenario, the low-quality data have fewer informative features.
```julia
# the high- and low-quality data share the same cluster centroids
# there are two clusters, and d features
XHmean = hcat(randn(Float32, d), randn(Float32, d))
XLmean = copy(XHmean)
# 50% of the original features are non-informative
@@ -57,10 +59,10 @@ for c ∈ [1, 2]
    XLtrain[:, findall(x -> x==c, class)] .= rand(MvNormal(XLmean[:, c], 0.05f0 * noise_level * I), count(x -> x==c, class))
end
```
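The lines that make half of the features non-informative are not shown in the diff. One plausible reading, sketched below, is that a feature carries no cluster information when both clusters share the same mean for it. The zero value, the random choice of features, and the names `n_noninformative`, `noninformative` and `informative` are assumptions made for illustration.

```julia
using Random

Random.seed!(1)
d = 10
XHmean = hcat(randn(Float32, d), randn(Float32, d))
XLmean = copy(XHmean)

# Assumption: a feature is non-informative when its mean is identical in both clusters.
n_noninformative = d ÷ 2
noninformative = randperm(d)[1:n_noninformative]   # distinct feature indices
XLmean[noninformative, :] .= 0f0

# The remaining features keep the centroids shared with the high-quality data.
informative = setdiff(1:d, noninformative)
```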
##### III. Low-quality data noisier than the high-quality data with 10% outliers
```julia
# the high- and low-quality data share the same cluster centroids
# there are two clusters, and d features
XHmean = hcat(randn(Float32, d), randn(Float32, d))
XLmean = copy(XHmean)
n_samples = 1000
@@ -77,5 +79,9 @@ end
XLtrain[:, rand(1:n_samples, Int(round(n_samples / 10)))] .= randn(d, Int(round(n_samples / 10))) .* 2
```
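Note that `rand(1:n_samples, k)` draws indices with replacement, so the line above can overwrite slightly fewer than 10% distinct columns. If exactly 10% outliers are wanted, a possible variant is to draw distinct indices with `randperm`. This is a sketch only: the zero-filled `XLtrain` stands in for the simulated data, and `n_outliers` and `outlier_idx` are hypothetical names.

```julia
using Random

Random.seed!(1)
d, n_samples = 3, 1000
XLtrain = zeros(Float32, d, n_samples)   # placeholder for the simulated low-quality data

n_outliers = round(Int, n_samples / 10)
outlier_idx = randperm(n_samples)[1:n_outliers]   # distinct column indices
# Overwrite the chosen columns with zero-mean Gaussian draws of standard deviation 2
XLtrain[:, outlier_idx] .= 2 .* randn(Float32, d, n_outliers)
```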
#### Results
- d = 3
![res1](/figs/2021JUL21/d3.svg)
- d = 10
![res2](/figs/2021JUL21/d10.svg)