This seems like a significant problem when it comes to testing.
I think a good solution needs two characteristics:
1. It should not flag correct answers as incorrect (e.g. when quizzing 行, don't mark "ok" as incorrect just because the quiz was looking for "profession").
2. It should also ensure that the full range of definitions gets exercised (same example: don't let the user always enter "ok" and never "profession").
The simplest way I can think of to do this is much like what already happens when you type the correct pinyin while Memrise is actually asking for the definition. Something to the effect of "Right, but we were thinking of another meaning that you've learned for this character. Try again."
The system could keep tracking the two (or more) meanings as separate words; it just needs to be smart enough to recognize when the user gives a correct answer other than the one it was trying to quiz.
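To make that concrete, here is a rough Python sketch of the idea. This is not how Memrise actually works; the names `MEANINGS`, `Result`, and `grade` are made up for illustration, and the accepted answers are just sample data. Each meaning is stored as its own item, and the grader returns a third outcome when the answer matches a different meaning of the same character.

```python
from enum import Enum


class Result(Enum):
    CORRECT = "correct"
    OTHER_MEANING = "other_meaning"  # valid for the character, but not the sense being quizzed
    WRONG = "wrong"


# Hypothetical sample data: each character maps to the senses the learner
# has studied, and each sense has its own set of accepted answers.
MEANINGS = {
    "行": {
        "xíng": {"ok", "to walk"},
        "háng": {"profession", "row"},
    },
}


def grade(character, target_sense, answer, meanings=MEANINGS):
    """Grade an answer against the sense currently being quizzed."""
    normalized = answer.strip().lower()
    senses = meanings[character]

    # The answer matches the sense the quiz was asking for.
    if normalized in senses[target_sense]:
        return Result.CORRECT

    # The answer matches another learned sense of the same character:
    # don't mark it wrong, but don't count the target sense as answered.
    for sense, accepted in senses.items():
        if sense != target_sense and normalized in accepted:
            return Result.OTHER_MEANING

    return Result.WRONG
```

With this, `grade("行", "háng", "ok")` gives `Result.OTHER_MEANING`, so the quiz can show the "Try again" message and keep asking until "profession" is entered. That covers both points: "ok" is never marked wrong, and the "profession" meaning still has to be produced.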
Maybe not elegant, but this would be a big improvement as far as I'm concerned. Anything to put an end to endless quizzing of 看，行，星 and others where I get marked wrong at random because Memrise wants the other definition.