Randomization might be a solution to this problem.
Let me first recall how EM works. It alternates between two steps, repeated iteratively:
- (Expectation) we compute the probability that each particular event belongs to each distribution
- (Maximization) given these probabilities, we choose the parameters of each distribution that maximize the likelihood.
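To make the two steps concrete, here is a minimal sketch for a one-dimensional mixture of Gaussians. The choice of Gaussian components, the function names `e_step` and `m_step`, and the toy data are my assumptions for illustration, not something fixed by the text above.

```python
import numpy as np

def e_step(x, weights, means, stds):
    """Expectation: probability that each event belongs to each component."""
    z = (x[:, None] - means) / stds
    # Weighted Gaussian densities, shape (n_events, n_components)
    dens = weights * np.exp(-0.5 * z ** 2) / (stds * np.sqrt(2 * np.pi))
    return dens / dens.sum(axis=1, keepdims=True)

def m_step(x, resp):
    """Maximization: weighted maximum-likelihood estimates of the parameters."""
    total = resp.sum(axis=0)  # effective number of events per component
    weights = total / len(x)
    means = (resp * x[:, None]).sum(axis=0) / total
    variances = (resp * (x[:, None] - means) ** 2).sum(axis=0) / total
    return weights, means, np.sqrt(variances)

# Toy data (assumed): two well-separated Gaussian clusters
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2.0, 1.0, 500), rng.normal(3.0, 1.0, 500)])

weights = np.array([0.5, 0.5])
means = np.array([-1.0, 1.0])
stds = np.array([1.0, 1.0])
for _ in range(50):
    resp = e_step(x, weights, means, stds)
    weights, means, stds = m_step(x, resp)
```

Here `resp[i, k]` is the probability from the expectation step that event `i` belongs to component `k`; every event contributes to every component, weighted by that probability.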
What if we sample events according to the distribution from the expectation step? At each iteration we would then assign each event to one component of the mixture (in the simplest case), or maybe to several of them. This kind of randomization should prevent the distributions from 'shrinking'.
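The simplest case (each event goes to exactly one component) could look roughly like the sketch below. It again assumes a one-dimensional Gaussian mixture; the helper names `responsibilities`, `sample_assignments` and `m_step_hard` are hypothetical, and keeping the old parameters for a nearly empty component is just one possible choice. Whether this really prevents shrinking is not tested here.

```python
import numpy as np

def responsibilities(x, weights, means, stds):
    """Expectation step: membership probabilities, shape (n_events, n_components)."""
    z = (x[:, None] - means) / stds
    dens = weights * np.exp(-0.5 * z ** 2) / (stds * np.sqrt(2 * np.pi))
    return dens / dens.sum(axis=1, keepdims=True)

def sample_assignments(resp, rng):
    """Draw one component per event with the expectation-step probabilities."""
    u = rng.random(len(resp))[:, None]
    labels = (u > np.cumsum(resp, axis=1)).sum(axis=1)
    return np.minimum(labels, resp.shape[1] - 1)  # guard against rounding in cumsum

def m_step_hard(x, labels, weights, means, stds):
    """Refit each component using only the events sampled into it."""
    weights, means, stds = weights.copy(), means.copy(), stds.copy()
    for k in range(len(means)):
        members = x[labels == k]
        weights[k] = len(members) / len(x)
        if len(members) > 1:  # assumption: keep old mean/std if too few events
            means[k] = members.mean()
            stds[k] = members.std()
    return weights, means, stds

# Toy data (assumed): two well-separated Gaussian clusters
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2.0, 1.0, 500), rng.normal(3.0, 1.0, 500)])

weights = np.array([0.5, 0.5])
means = np.array([-1.0, 1.0])
stds = np.array([1.0, 1.0])
for _ in range(50):
    resp = responsibilities(x, weights, means, stds)
    labels = sample_assignments(resp, rng)
    weights, means, stds = m_step_hard(x, labels, weights, means, stds)
```

Because the assignments are resampled at every iteration, the parameters do not settle to a fixed point but keep fluctuating around some values, so in practice one would probably average over the last iterations.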
OK, this again needs time for experiments.