Suppose $`\mathbf{X}^{H}, \mathbf{X}^{L}`$ are $`N \times V`$ feature matrices (e.g. connectivity between $`N`$ thalamus voxels and $`V`$ whole-brain voxels). In practice the two matrices can have different dimensions; to keep the notation uncluttered, we assume that the high- and low-quality images of a given subject have the same number of voxels. We further assume that $`\mathbf{X}^{H}`$ and $`\mathbf{X}^{L}`$ share the same latent variable $`\mathbf{Y}`$, an $`N \times K`$ binary matrix encoding each voxel's class.
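To make the shapes above concrete, here is a minimal NumPy sketch of the data layout. It is only an illustration: the toy sizes, the random features, and the variable names (`X_H`, `X_L`, `labels`, `Y`) are assumptions, and it assumes each voxel belongs to exactly one of the $`K`$ classes, so that every row of the binary matrix is one-hot.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes (hypothetical): N seed voxels, V target voxels, K classes.
N, V, K = 6, 4, 3

# Stand-ins for the high- and low-quality feature matrices, both N x V.
X_H = rng.standard_normal((N, V))
X_L = rng.standard_normal((N, V))

# One class label per voxel, shared by both data types.
labels = rng.integers(0, K, size=N)

# Binary N x K matrix: row n has a single 1 in the column of voxel n's class.
Y = np.eye(K, dtype=int)[labels]

assert X_H.shape == X_L.shape == (N, V)
assert Y.shape == (N, K)
assert (Y.sum(axis=1) == 1).all()
```

Indexing the identity matrix by the label vector is just a compact way to build the one-hot rows; the same `Y` pairs with both `X_H` and `X_L` because the latent classes are shared.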
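For a single voxel $`n`$, suppose $`\mathbf{y}_{n} \sim \text{multinomial}(\mathcal{\pi})`$ and $`p(\mathbf{x}^{L}_{n}|y_{nk}=1) = \mathcal{N}(\mathbf{x}^{L}_{n}|\mu_{k}, \Sigma_{k}^{L})`$. To let the high-quality data inform inference on the low-quality data, we assume $`p(\mathbf{x}^{H}_{n}|y_{nk}=1, \mathbf{U}) = \mathcal{N}(\mathbf{U}\mathbf{x}^{H}_{n}|\mu_{k}, \Sigma_{k}^{H})`$, where $`\mathbf{U}^{T}\mathbf{U} = \mathbf{I}`$. The two data types therefore share the class means $`\mu_{k}`$ (after the high-quality features are transformed by $`\mathbf{U}`$) but keep separate covariances $`\Sigma_{k}^{H}`$ and $`\Sigma_{k}^{L}`$. The complete likelihood can be written as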