Mandarin FAQ - Parts of Speech

I have a slight problem with how Memrise handles parts of speech in the Mandarin courses. Actually, it’s a ‘not-so-slight’ problem. According to the Mandarin FAQ, each unique character is assigned all the parts of speech that are associated with the character. ("Many Chinese words can be more than one part of speech. In this case the all, separated by semi colons.") However, this does not make sense because of Memrise’s implementation. When we are introduced to a character, we are introduced to one specific meaning, for example, 家 (home). According to Memrise, this character is both a noun and a measure word. But is the definition I learned for the noun or for the measure word? What if I wanted to use the character as a measure word? Do I actually know how to use the character as a measure word? For example, is it a measure word for buildings, homes, families, groups of people, lumber, windows, etc.?

This is just a simple example. Another example is this character: 短 (duan3), which Memrise has classified as an adjective and a noun. The user learns the adjective meaning “short”, but is that the meaning of the noun as well? The ABC Dictionary (DeFrancis/Yanyin) shows an entry for the verb form (to lack, to owe), and the bound form is “weak point”. I’ve seen other characters with even more “parts of speech” associated with them. Some Chinese dictionaries I’ve seen explicitly differentiate the multiple interpretations of the character by part of speech, ex: (家/1307102). Others just list all the possible meanings of the character. I would prefer using English/Romance language parts of speech where possible, but that’s just because I don’t want to get into free morphemes vs. bound morphemes, etc. For more information on that, here’s a link.

In other languages (ex: Spanish), some words are differentiated by their part of speech, and multiple entries exist. One example is the word ‘tarde’. If the part of speech is an adverb, I enter “late”; if it’s a noun, I enter “afternoon”. (I don’t need the definite article (“la”), because the part of speech determines how I should interpret and use this word.) I don’t find the different definitions confusing because my mind associates specific meanings with the word’s part of speech.

Could Memrise split up a character’s meanings based on part of speech (verb, adverb, pronoun, noun, adjective, measure word, preposition, radical, chengyu, phrase, etc.) and the specific definition associated with that part of speech? Example: entry 1: 家, noun, home; entry 2: 家, mw, mw for companies/families. If a character has multiple meanings that have the same part of speech, can we use some hint mechanism (e.g. verb 1) so that the user knows what to input?

Posted by DarthJen 5/20/12, last update 5/25/12 (4 years ago)
  • Very interesting post, Jennifer. My initial reaction is that I agree that the current system for parts of speech is confusing, and that more prominence needs to be given to alternative meanings. I'm not sure that separately teaching meanings according to parts of speech is the answer though.

    First, I'm not sure that English/Romance categorisations are up to the job of classifying Hanzi in a non-confusing way. (Although I read the Rutgers slides, I'm not sure I understand your free/bound morphemes reasoning - maybe you could elaborate?)

    Second and more important, the meaning of a character is usually the product of character combinations and/or the context of use. In real life, we will learn which meaning to assign a character according to these factors, not a categorisation of parts of speech.

    I think therefore that testing should move towards simulating those conditions: more teaching-in-context (sentences, examples, fill in the blanks, etc). As users do this, they ought to progressively move away from English meanings altogether - after all, they can only be imperfect approximations of their Chinese meanings, and the sooner we can think of a character without thinking of it in translation, the better.

    Posted by Azimuth 5/20/12 (4 years ago)
  • The problem is that you only learn one meaning of the character. Therefore, in your example, if Memrise wants to test me on the measure word meaning in a combination sentence, I would not have a clue what to do, even though I recognize the character. I would have just failed the test that you suggested. There was no opportunity for me to learn the alternate meanings. Do you see the problem? (This is theoretical because I actually do know this measure word apart from Memrise.) So all I learned was one meaning: home. I'm glad that it's listed in the pos, but I don't think many of us actually look at the part of speech field except to differentiate specific characters such as 着. I mean, when a character is a noun, verb, particle, radical, heavenly branch, and measure word, it gets absurd, especially when the meaning we learn is, say, "to fire" or "home". Wouldn't it be easier to bind that specific meaning to a specific part of speech, a verb or a noun as identified in English, and then introduce the other meanings, such as a measure word or adverbs?

    I would have preferred to use their (Chinese) grammar terms for their language, but the jump is hard for me to make and doesn't seem to provide useful information. For example, a bound morpheme just means that the character does not stand alone (like peng2 in the word for friend, peng2 you). You don't say peng2 on its own. (At least that's how I interpreted the slide.) There are other parts of speech that are the same, such as nouns and verbs. It's only the more advanced Chinese grammar that I would prefer not to learn just yet.

    How memrise categorizes characters is a sloppy implementation and people shouldn't passionately defend it just because that's the way it's done or because someone that they respect and admire created it. Sometimes, people need to think radically and identify the root problem. Look at your other sources and you will find that they too, break down characters by meaning.
    This is the entry at MDBG - (家). Look at it (particularly how they list the different meanings), look at other dictionaries and learning sources. Then come back and tell me honestly that you like how Memrise implements learning characters and their (one) meaning.

    I agree with the last paragraph, but one must crawl before one walks. If Memrise chooses to teach only one meaning per character, they do a disservice to the user. This isn't French or Spanish or English, and Chinese characters are multifunctional. Let's treat them like Chinese characters and not a poor man's version of other languages' words.

    (With the merge process, characters that show one meaning would most likely be merged into the dominant character (i.e. the one with the string of all the parts of speech). This would create loss of information because I would love to have a course that looks only at measure words (not the kevin shan one) or only verbs, or only nouns, but, after a merge, it would become irrelevant.)

    Posted by DarthJen 5/21/12 (4 years ago)
  • Good points. Testing a character on alternative meanings (e.g. by means of sentences etc) without teaching those meanings first probably isn't the way to go. However, I think that the best way to teach these is through presenting them in some (easy) context rather than tying their recognition to categories like parts of speech, because that's how we will encounter them in using the language.

    I don't think the MDBG example is much of an improvement: it's just many, many different examples of 家 as a measure word, one use of 家 as a surname, and then one entry that lists all the other alternative meanings.

    The merging point is also a good one. Memrise should be able to recognise particularly useful categories like measure words, no doubt, and you should be able to create courses with reference to those. I hope that the upcoming wiki structure will be able to accommodate this. Perhaps it would also be useful if we were able to tag alternative meanings with 'parts of speech' identifiers (my preference would be for the Chinese parts of speech, yours perhaps not - an issue that would need to be decided upon).

    Posted by Azimuth 5/21/12 (4 years ago)
  • This is a very interesting discussion, and one that we need to get better at dealing with. I totally agree that teaching just one meaning per character is not the right way to go. But there are extra complexities with separating items into many items: the relationships between those items mean that the memories that you form will have different characteristics than will the memories for totally new words.

    So, for example, if you learn the character 行 first as "ok", and then later were to learn "profession" as a separate item, it would be annoying to have to spend time getting that item through the greenhouse, and then growing it up to a full-sized plant. The other meaning can be learned much faster, and it should not be treated as a wholly new item.

    This can be solved in a number of ways. We are going to be making the scheduling much more adaptive according to how difficult individual items are to learn, and so requiring tests more or less often. This is on the way, and would help with this if we did use separate items.

    But I think that once you have learned the first meaning of a word, a better approach is to then build up a richer understanding through the use of the word in context. So, in the example of the measure word, once you reach a certain level of familiarity with the word, it could trigger the first "alternative meaning sentence" presentation. This would show the character being used as a measure word. You would also be able to see the comments thread where people could explain what this different use is. Then later on you would be tested on that use.

    More alternative meanings could keep on being introduced as you go along; but introduced through sample sentences, rather than by learning extra single word translations. The concepts that Chinese words represent are often very different to the concepts in English, so learning all the possible English words that can fit the Chinese one is not a very efficient or effective process. I think. Does that sound sensible?

    With regard to the part of speech, this is a tricky one. I personally find it a really painful way to categorise Chinese words. Even the part of speech categorisations that are used by the Chinese seem to be largely borrowed from Western linguistics (I say "seem" because I don't know this for a fact, but they do seem to be trying to ram the square peg of the Chinese language into the round hole of western grammatical terms, and then adding on a few more just to cover the bases.) I find that constantly telling me that a word can be a noun and a verb and that the meanings are translated a bit differently is not particularly helpful: if you just get used to seeing the word in different contexts, you get a feel for how it is used. Whether it is serving as a particular part of speech or not seems, to me, to be more confusing than helpful. In some ways I would rather just hide the parts of speech for Chinese, but this does also hold annoyances in certain places and is not an ideal solution.

    Being able to keep certain duplicates in the database is something that will become possible more easily very soon - it is not in v1 of the wiki that is on the site now, but it will come soon after.

    I hope that makes some sense, and I look forward to further discussion and ideas for how we can improve!

    best wishes


    Posted by BenWhately Staff   5/22/12 (4 years ago)
  • Sounds great. Regarding sentences, I wonder if it will be possible to choose between hanzi-only and hanzi+pinyin sentences. I find it hard to force myself to read hanzi when the pinyin is right alongside. Ideally, you would be able to opt for hanzi-only sentences, with pinyin accessible by clickable link ('show pinyin' etc) or by tooltip.

    Posted by Azimuth 5/22/12 (4 years ago)
  • The default is going to be to only have the Hanzi, and no pinyin, for the reason that you suggest; if pinyin is there then it is very hard not to read it. You will get the audio for the sentence after the test (not in v1, but soon) so that should reinforce the pronunciation. We will have to test and tweak it to get it absolutely right though.

    Best wishes


    Posted by BenWhately Staff   5/23/12 (4 years ago)
  • The existing structure got the job done as far as getting courses up and running, but the issue is sticky even if one were studying English. Let's say you want to make a course that included the word "dry" in the metaphorical sense (dry humor, dry wit, he is very dry). The same issue pops up.

    I think under each word there have to be totally separate meanings, and courses should choose and test a specific meaning, not a word. And in Mandarin, specific meanings may have different pronunciations, mems, etc. They also would have different confusables and be tested differently. If you want to test me on whether I know the right measure word for a horse, the question may be 一 (blank) 马? We don't even have to have English or pinyin to test it. So each meaning can even have a setting on what kinds of questions are appropriate for it.

    Meanings also should be able to be labeled so we can mark which ones no one really says but are traditional meanings that will help us understand (component meanings?).

    Posted by ThatHorse 5/23/12 (4 years ago)
  • @ThatHorse I disagree to some extent. Initial learning should be recognition of a character and tying that character to a pronunciation and meaning. Which meaning? It depends. Generally I would say it should be the meaning that gives the most insight into the range of meanings taken on by the character, but it could also be simply the most common meaning taken by the character, or - particularly in the case of characters and components that don't appear on their own - the meaning that best lends itself to vivid mems.

    Once you've done that, you can expand your knowledge of that character to include other meanings, and this is best done through contextual learning.

    As per your example, once you've learnt one meaning of 'dry' - the literal meaning - it's much easier to expand into other, more figurative meanings. In Chinese, it would be more efficient to learn 干燥 gānzào as 'dry' first, and then learn through context that it can also be used to mean 'dull' or 'uninteresting', rather than learning both meanings separately and without context.

    However, I do agree that it could be useful to be able to make lists that test specific meanings - i.e., to specify that the user should be tested not on the default meaning of the character, but a specific one (although other meanings should remain acceptable answers). However, what is to prevent fragmentation across word-lists, with some lists asking for one meaning, and others asking for another? It might get very confusing for users as you could pick a wordlist to learn without knowing what meaning would be specified.

    Your last point is really important. We need a good way to indicate formal/informal, written/oral, and what you call component meanings (etymological meanings, maybe? Hmm.) Maybe other tags/labels as well. Of all the dictionaries I use, only Tuttle's Learner Dictionary on the Pleco app is consistently good at indicating this kind of information. memrise could be uniquely useful here.

    Posted by Azimuth 5/23/12 (4 years ago)
  • Look - I'm just looking at this as an IT person with a focus in databases and some experience with what is and isn't practical to implement in code. It is possible to create a "measure/classifier" word test if you control the parts of speech and associate the proper measure word with the correct noun. For example: one bank, one shirt, three books, seven knives, 14 photo albums. If Memrise sets up the database/wiki properly, it will have the flexibility to implement this sort of test and other tests. I have a personal connection who is going to make something like the measure word test I mentioned above for me :-) As it is, I'll probably create some spreadsheets and make them private so as not to lose the data to the greater Memrise wiki, and then use the custom-made app to really set the measure word/noun connection in my mind once I've memorized enough nouns and measure words.

    Posted by DarthJen 5/23/12 (4 years ago)
  • BTW - @ Ben W - I agree with you, the way characters are defined seems awkward and painful (especially when using European-based grammar terms). I mean - Wo3 hen3 hao3 - hen3 hao3 may be an adjective but it has properties that are distinctly verb-like. The problem is that I haven't found any efficient and rational means of classifying the characters except by classifying them through their definitions (i.e. to drink (verb), lamp (noun), therefore (adverb)).

    Posted by DarthJen 5/23/12 (4 years ago)
  • @Azi I didn't mean to propose learners start out learning multiple meanings. I'm only proposing that our current "word lists" should be redefined as "meaning lists", i.e. each word listed in there is actually, like you say, "one meaning" of the word. Basically we just have to relabel things like this:

    current words = single meaning
    current lists = lists of those (prev. called words) meanings
    new structure = "word", a collection of those meanings

    The only new work would be reducing the current words to single meanings, and letting people start grouping them together as structures.

    Meanings also could have types, e.g. measure word, and types would then have their own optimal ways to be tested. This would resolve @jenn's concerns if I understand her correctly.

    Look at 得 for example. It has two different pronunciations as a verb with two different meanings. It also has a different meaning as a particle. In the system I'm proposing, all three of those would be separate "words", but let's properly re-label that as "meanings." A beginner word list would perhaps choose one of those three meanings to include, perhaps the 我说得不好 one?
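
    A minimal sketch of this "meaning list" idea, in Python. The Meaning class, its field names, the TEST_STYLES table, and the simplified glosses are all assumptions made for illustration; nothing here reflects how Memrise actually stores its data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Meaning:
    """One teachable entry: a single meaning of a character (hypothetical schema)."""
    hanzi: str
    pinyin: str
    gloss: str
    kind: str  # the meaning's type, e.g. "verb", "particle", "measure word"


# Hypothetical mapping from meaning type to the question formats that suit it.
TEST_STYLES = {
    "measure word": ["fill in the blank"],
    "particle": ["fill in the blank", "sentence"],
}
DEFAULT_STYLES = ["hanzi to English", "English to hanzi"]


def styles_for(meaning):
    """Return the question formats considered appropriate for this meaning's type."""
    return TEST_STYLES.get(meaning.kind, DEFAULT_STYLES)


# Three separate entries for 得; the glosses are simplified assumptions.
DE_OBTAIN = Meaning("得", "de2", "to obtain", "verb")
DE_MUST = Meaning("得", "dei3", "to have to, must", "verb")
DE_PARTICLE = Meaning("得", "de5", "complement marker, as in 我说得不好", "particle")

# A beginner list references one specific meaning, not the whole character.
beginner_list = [DE_PARTICLE]
```

    With entries shaped like this, a list would point at DE_PARTICLE directly, so the learner is tested on one meaning while the other two 得 entries stay separate items.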

    Posted by ThatHorse 5/24/12 (4 years ago)
  • @ThatHorse - it would resolve my concern. It would also make memrise's database more robust and flexible. It's a win/win situation if they implement it. They could even do more advanced features such as automatic sentence generation if they clean up the database.

    Posted by DarthJen 5/24/12 (4 years ago)
  • So if I get you right, you are basically asking for a relationship to be made that links together different "meanings" of the same underlying character, or "word", is that right? This would indeed help in a lot of places, and we are planning to introduce it. It is going to take a little while to get finished though, sorry about that. But it will come!

    best wishes


    Posted by BenWhately Staff   5/24/12 (4 years ago)
  • (Coming from a newbie so don't be offended if my inexperience shows- I don't know all the functionalities yet)

    You shouldn't try to make memrise do things it's not designed to do. I've looked at the dictionary entry Jennifer posted and frankly, NO, it's not an improvement over memrise, it's completely overwhelming! The reason why memrise works so well in my opinion is because you're only learning one thing at a time, for example correspondence btw a character and its basic meaning, or at most 2 things (character- meaning, and character-pronunciation). If you want more information, there already are boxes (etymology/ samples/ ...) where you can learn/add mems to show what the other meanings or uses of the character are. Little by little, as people add mems, the entries for each character will get more and more complete, and depending on what your level or interests are, you can choose to be satisfied with the basic meaning or to deepen your understanding of the character.

    That said, you could include metaphorical meanings (as in the example with English 'dry') in separate entries, not for 'dry' itself, but you can create one for 'dry wit'. To some extent, that's already what is done in Chinese when we learn compounds that are made up of characters we've already learnt.

    The same could be done with measure words: you could create word entries like "one company", "one horse" (I don't know the measure word for 'horse', not that it matters at my level). The only problem I guess (I'm thinking as I write) would be that when you are tested, since the system only includes actual examples it would not supply 'wrong' answers and the right one would be too easy to find...

    Anyway, I don't think memrise is going to teach you grammar or usage. It's great for the basic building blocks, but the greater feel for the language will only come from listening and reading, and actually interacting with people (at least I hope so).

    Posted by mariepi70 5/24/12 (4 years ago)
  • @Ben - what if you had this: 工具书 (gong1 ju4 shu1, a reference book)? This is a noun. I'm going to make some assumptions now about the table structure: row id, character, meaning, part of speech, sound 1, sound 2, sound 3, course name. What if you added another attribute, classifier/measure word? This would only be useful if you parse the characters and divide them by their meanings' parts of speech. For example, take this character: 贼 (thief), a noun. This character also means evil, cunning, and especially. Have separate entries for those meanings (adj, adj, adverb), but for the noun, you could link up the noun's meaning/character with its measure word, 帮 (group / gang / party).

    Once your database is set up properly, you could start having fun. How hard would it be to make a dynamic measure word test? The system selects a noun-class character and the user needs to select the measure word for that noun from a multiple-choice screen. Or how about creating a phrase: a random number generator creates a Chinese number string, followed by a measure word, and then the user needs to select the noun that is classified by this measure word. This could be generated from just simple database queries. If you had a programmer, you could program it so that it generates grammatically correct phrases, and then the user could (theoretically) select the best translation from a multiple-choice test.

    This kind of test requires the user to apply the rules of which nouns are covered by which measure words without memorizing fixed phrases. (That’s my main objection to the other measure word course - I have a feeling that I would memorize 1 cup of coffee, but I would not automatically transfer that understanding of the measure word to other phrases such as 3 cups of black tea.)

    To all others who don’t want to create multiple entries to separate the meanings based on part of speech: I’m not asking you to re-learn the character. I’m just saying that you could become gradually aware of other meanings associated with the character when you seed them. (The fact is that when you learn (短) short, you are learning an adjective meaning; you are not learning the verb meaning. Therefore, that entry should not even be identified as being a verb. It’s like classifying a human being as a collection of bacteria. Just because most of our mass is composed of bacteria doesn’t make us bacteria.) I would not want to learn all the meanings of a character the first time I’m exposed to it. I wouldn’t want to be exposed to new meanings for a while. But afterwards, when I consistently answer tests on this word correctly, couldn’t I learn a new meaning? Would that really affect my previous understanding of this character?

    For example, look at the word “boot”. I’m sure that if you think about it, you have associated multiple meanings with this sequence of letters. This is what comes to my mind: footwear, trunk (part of a car), giving the boot to someone, booting up a computer. How I use the word in a sentence drives how I interpret the meaning. In no way am I confused about the meaning of “boot”.

    @mariepi70 - Memrise has already stated that it’s going to create some sentence-style testing scenarios. That will, by definition, force the user to learn grammar. For example, you can’t say this exactly in Mandarin: “I will eat dry fried beef with chilies at that restaurant tomorrow”. The sentence would require that I move the time and location to the front of the sentence: “Tomorrow, at that restaurant, I eat dry fried beef with chilies.” I also couldn’t say “I am fine” when asked how I’m doing. I would say, “I good”. You don’t use the “to be” verb in front of adjectives.

    What I’m suggesting would not really affect you. You would still learn the characters, only this time you would associate a specific part of speech with a specific meaning. Memrise wouldn’t be stupid enough to overwhelm a user with multiple definitions. They (Ben) have repeatedly given the impression of wanting a one word / one meaning association.
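
    The dynamic measure word test described earlier in this post could be sketched roughly as follows, using Python's built-in sqlite3 module. The entry table, its column names, and the sample rows are assumptions loosely based on the guessed table structure above, not Memrise's real schema; the point is only that one part of speech per row plus a measure_word column is enough to generate a multiple-choice question with a couple of queries.

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE entry (
           id INTEGER PRIMARY KEY,
           hanzi TEXT NOT NULL,
           meaning TEXT NOT NULL,
           pos TEXT NOT NULL,   -- exactly one part of speech per row
           measure_word TEXT    -- classifier for noun rows, NULL otherwise
       )"""
)
# Hypothetical sample rows: 家 appears twice, once per meaning.
conn.executemany(
    "INSERT INTO entry (hanzi, meaning, pos, measure_word) VALUES (?, ?, ?, ?)",
    [
        ("书", "book", "noun", "本"),
        ("银行", "bank", "noun", "家"),
        ("刀", "knife", "noun", "把"),
        ("衬衫", "shirt", "noun", "件"),
        ("家", "home", "noun", "个"),
        ("家", "measure word for companies/families", "mw", None),
        ("短", "short", "adj", None),
    ],
)


def measure_word_question(conn, rng, n_choices=4):
    """Pick a random noun and return (noun, shuffled choices, correct answer)."""
    nouns = conn.execute(
        "SELECT hanzi, measure_word FROM entry "
        "WHERE pos = 'noun' AND measure_word IS NOT NULL ORDER BY id"
    ).fetchall()
    noun, answer = rng.choice(nouns)
    # Wrong answers are the measure words that belong to other nouns.
    distractors = [
        row[0]
        for row in conn.execute(
            "SELECT DISTINCT measure_word FROM entry "
            "WHERE measure_word IS NOT NULL AND measure_word <> ? "
            "ORDER BY measure_word",
            (answer,),
        )
    ]
    choices = rng.sample(distractors, min(n_choices - 1, len(distractors)))
    choices.append(answer)
    rng.shuffle(choices)
    return noun, choices, answer


noun, choices, answer = measure_word_question(conn, random.Random())
print(f"一 (blank) {noun}?  choices: {' / '.join(choices)}")
```

    Because the wrong answers are drawn from measure words attached to other nouns, the question does not depend on memorized fixed phrases, only on the noun/measure word link stored in each row.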

    In addition, the mems are nice for the initial introduction. But I don't really use them later on and I tend to forget most of them. Even the informative ones (sorry Tasker D. who makes very informative mems). The fact is, we learn what we practice. If the information is on a mem and I'm not tested on it, I will forget it.

    What I’m requesting is rational and highly logical. This is just good database design so that Memrise can have the flexibility to do other things later IF they want to.

    I guess I'm coming at this from a very strong IT background that intersects with an interest in language. Modeling data/real-world scenarios and organizing data (because data is useless; organized data = information = useful) are things I'm very comfortable with. I think I'm subconsciously looking at how Memrise is set up and noticing things that a good db admin/programmer/IT professional would never allow.

    Posted by DarthJen 5/24/12 (4 years ago)
  • I think there is quite a lot of convergence about what we all want from memrise (initial learning of one meaning, then increasing exposure and learning of others), though we may disagree on how to achieve it.

    I have one quick question for Ben: I've seen some words include the measure word in the 'special properties' field. I'm assuming that this is just one contributor's initiative, rather than an accepted convention? I guess that it'd be just as well to hold off on supplying measure-word info until the infrastructure supports a specific field or tags for them?

    Posted by Azimuth 5/24/12 (4 years ago)
  • Just to clarify about my suggestion, it should be relatively easy to implement. We simply have to re-label the existing word entries as meanings, encourage those few that have multiple meanings within them to be reduced to a single meaning, and have the additional meanings "split" into new ones. And (later) allow the creation of a new structure called "words" that is allowed to group together multiple entries. Possibly we could even skip this step (perhaps by making a "view word" page simply search for things that have the same spelling, e.g. look up the "只" word page and it simply lists all the entries prev. known as words on one page). Almost no creation necessary.
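
    A rough sketch of that "view word" shortcut in Python, assuming entries are flat records with a hanzi field. The ENTRIES list, the field names, and the glosses are hypothetical; the sketch only shows that the grouping can be computed at lookup time instead of being stored as a new structure.

```python
# Hypothetical flat entries, each one a single meaning (formerly a "word").
ENTRIES = [
    {"hanzi": "只", "pinyin": "zhi3", "gloss": "only"},
    {"hanzi": "只", "pinyin": "zhi1", "gloss": "measure word for certain animals"},
    {"hanzi": "家", "pinyin": "jia1", "gloss": "home"},
]


def view_word(entries, hanzi):
    """Return every entry with the same spelling, in their original order."""
    return [entry for entry in entries if entry["hanzi"] == hanzi]


# The "word page" for 只 is just the result of the search.
for entry in view_word(ENTRIES, "只"):
    print(entry["hanzi"], entry["pinyin"], entry["gloss"])
```

    Nothing new has to be created or maintained here: adding another 只 entry to the list makes it appear on the page automatically.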

    Posted by ThatHorse 5/24/12 (4 years ago)
  • @ThatHorse - to me the basic units in the Memrise Mandarin courses I am doing are not words or meanings, they're characters. Sometimes they happen to be words, sometimes not. That's why the parts of speech often seem irrelevant. It's a bit like word roots in English: -fer- means "carry", but what part of speech is it? Or for an example more like Chinese: -morph- means "form", a root, no part of speech, but it can also be used as a word in its own right.

    Mandarin courses do include words too, usually as compounds. It's basically a way of seeing possible uses of characters in context, which is why I'm suggesting that a painless way of including some grammar is to include it in an expression. For measure words, that would be for example "one company", but it could also be used for verb endings like "qilai" (don't know what they're called): just create entries for frequent expressions using these structures (kanqilai, ting budong - sorry I haven't discovered how to type characters yet!!)

    @Jennifer- I didn't know memrise was going to test syntax but I say good luck to them! To my knowledge translation systems that work well are all based on probability of co-occurrence rather than actual rules of syntax. However I have no technical knowledge so I'll just wait and see.

    Posted by mariepi70 5/25/12 (4 years ago)
  • @ThatHorse - your implementation is exactly the way I would like it to be.
    @mariepi70 - expression testing would be nice - but the problem one confronts is how to interpret the sequence of characters. For example, when I read sentences such as "Zhe4 shi4 Wang2 Xian1sheng" - "this is Mr. Wang" - my first impulse is to read "this is king mister" because the meaning I've associated with wang2 is king. BTW - knowledge translation systems would have the grammar rules programmed in. Otherwise a would perfect this be sentence. See :)? That sentence should have been "Otherwise, this would be a perfect sentence."

    Posted by DarthJen 5/25/12 (4 years ago)
