Skip to content

Dictionary

MeCab uses two types of dictionaries:

  • System dictionary: A dictionary trained using a large corpus which contains a large number of common words and phrases.
  • User dictionary: A dictionary that allows you to add a small number of custom words or morphemes

For Korean language, there are several system dictionaries as follows:

System dictionary

python-mecab-ko uses mecab-ko-dic as default system dictionary. However, if you would like to use other system dictionaries, you can pass dictionary_path when initializing MeCab.

Note

When using other system dictionaries, they must already be compiled into binary files using mecab-dict-index.

For example, to use openkorpos system dictionary, first install openkorpos-dic using pip:

$ pip install openkorpos-dic

Then, initialize MeCab by passing dictionary_path as follows:

>>> import openkorpos_dic
>>> from mecab import MeCab
>>> mecab = MeCab(dictionary_path=openkorpos_dic.DICDIR)
>>> mecab.pos("아버지가방에들어가신다")
[('아버지', 'NNG'), ('가', 'JKS'), ('방', 'NNG'), ('에', 'JKB'), ('들어가', 'VV'), ('신다', 'EP+EF+VCP')]

If you would like to check which dictionary MeCab is currently using, inspect MeCab.dictionary property.

>>> mecab.dictionary
[Dictionary(path=PosixPath('openkorpos_dic/dicdir/sys.dic'), number_of_words=816283, type=<Type.SYSTEM: 0>, version=102)]

User dictionary

python-mecab-ko also supports user dictionary. When initializing MeCab, you can add multiple user dictionaries by passing user_dictionary_path as follows:

>>> from mecab import MeCab
>>> # mecab = MeCab(user_dictionary_path="nnp.dic") # When adding one user dictionary
>>> mecab = MeCab(user_dictionary_path=["nnp.dic", "nng.dic"]) # When adding multiple user dictionaries
>>> mecab.dictionary
[Dictionary(path=PosixPath('mecab_ko_dic/dictionary/sys.dic'), number_of_words=816283, type=<Type.SYSTEM: 0>, version=102),
 Dictionary(path=PosixPath('nnp.dic'), number_of_words=1, type=<Type.USER: 1>, version=102),
 Dictionary(path=PosixPath('nng.dic'), number_of_words=1, type=<Type.USER: 1>, version=102)]

Please refer to Custom Vocabulary documentation for instructions on how to create a user dictionary.