La Vita è Bella

2007-05-14

Miao: (will be) a new smart pinyin input method for Mac OS X

Why another pinyin input method?

The answer is simple: the current pinyin input methods on the Mac don't satisfy me. I'm going to scratch my own itch :P

ITABC (which comes with the Mac) and the Pinyin module in OpenVanilla aren't smart input methods; QIM is smart, but it's shareware, and expensive shareware at that (USD 19.99 or RMB 69.00, with free updates only within one major version); FIT is smarter than ITABC and free (of charge), but it's not smart enough, and I don't like some of its features but can't turn them off.

Miao will be a smart pinyin input method, designed for people who like to input a whole sentence at once. It will learn the user's habits and, after some training, eventually become an essential tool for input.

The idea and features

I won't focus on pre-trained dictionaries the way Sogou Pinyin or Google Pinyin do. My focus will be on "learning".

For each sentence you input, it will record the words within the sentence, both as continuous (adjacent) words and as non-continuous words, and calculate probabilities from them.

Continuous word probability will be used to auto-learn new words: if there's a high probability that two known words come together, they will be merged into a new word.
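
Here is a rough sketch of how that auto-learning step could look in Python. Nothing here is Miao's actual code: the class name AdjacentPairLearner, the min_count and min_prob thresholds, and the choice to estimate the probability as "pair count divided by count of the left word" are all assumptions made for illustration.

```python
from collections import Counter


class AdjacentPairLearner:
    """Counts adjacent word pairs and promotes frequent pairs to new words.

    Hypothetical sketch: names and thresholds are illustrative assumptions.
    """

    def __init__(self, min_count=3, min_prob=0.5):
        self.word_counts = Counter()  # how often each word was seen
        self.pair_counts = Counter()  # how often (left, right) appeared side by side
        self.min_count = min_count    # assumed: minimum evidence before merging
        self.min_prob = min_prob      # assumed: threshold on P(right follows left)
        self.learned = set()          # merged words promoted so far

    def observe(self, words):
        """Record one committed sentence, given as a list of known words."""
        self.word_counts.update(words)
        self.pair_counts.update(zip(words, words[1:]))

    def pair_probability(self, left, right):
        """Estimate how likely `right` is to directly follow `left`."""
        seen = self.word_counts[left]
        return self.pair_counts[(left, right)] / seen if seen else 0.0

    def learn_new_words(self):
        """Return the merged words promoted during this pass."""
        new_words = []
        for (left, right), count in self.pair_counts.items():
            merged = left + right
            if (count >= self.min_count
                    and self.pair_probability(left, right) >= self.min_prob
                    and merged not in self.learned):
                self.learned.add(merged)
                new_words.append(merged)
        return new_words
```

Feeding it the same sentence a few times, e.g. calling observe(["火箭", "队", "赢", "了"]) three times, is enough for learn_new_words() to propose "火箭队" under these default thresholds. A real input method would probably want smarter thresholds and some way to forget words that were learned by mistake.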

Non-continuous word probability will be used for sorting candidate words. The word with the highest probability will be ordered first. For instance, for the pinyin word "yaoming", if the sentence also contains "huojian" (火箭/Rockets) or "maidi" (麦迪/McGrady), it's more likely to be "姚明" (Yao Ming, the NBA player); if the sentence contains "can" (惨/misery), it's more likely to be "要命" (Kill me!).
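
The "yaoming" example can be sketched in Python roughly like this. Again, this is only an illustration under assumptions: ContextRanker and its methods are made-up names, the score is a simple sum of co-occurrence ratios, and for brevity it counts any two words in the same sentence, adjacent or not.

```python
from collections import Counter
from itertools import combinations


class ContextRanker:
    """Ranks candidate words by how often they shared a sentence with the context.

    Hypothetical sketch: the scoring formula is an illustrative assumption.
    """

    def __init__(self):
        self.word_counts = Counter()  # number of sentences containing each word
        self.cooccur = Counter()      # number of sentences containing both words

    def observe(self, words):
        """Record one committed sentence, given as a list of words."""
        unique = set(words)
        self.word_counts.update(unique)
        # Store each unordered pair under a sorted key so lookups are symmetric.
        for a, b in combinations(sorted(unique), 2):
            self.cooccur[(a, b)] += 1

    def score(self, candidate, context):
        """Sum, over context words, the share of their sentences that had `candidate`."""
        total = 0.0
        for other in set(context):
            if other == candidate:
                continue
            key = tuple(sorted((candidate, other)))
            seen = self.word_counts[other]
            if seen:
                total += self.cooccur[key] / seen
        return total

    def rank(self, candidates, context):
        """Return candidates ordered from most to least likely in this context."""
        return sorted(candidates, key=lambda c: self.score(c, context), reverse=True)


ranker = ContextRanker()
ranker.observe(["火箭", "姚明", "麦迪"])
ranker.observe(["姚明", "火箭"])
ranker.observe(["惨", "要命"])

# Candidates for the pinyin "yaoming", ranked against two different contexts.
print(ranker.rank(["要命", "姚明"], ["火箭", "麦迪"]))
print(ranker.rank(["姚明", "要命"], ["惨"]))
```

With that tiny history, the first call puts "姚明" first and the second puts "要命" first, which is the behaviour described above.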

The license

It may be licensed under the GPL or a BSD license; that's not decided yet. But I can promise that it will certainly be free, as in freedom!

About the name

"Miao" is the pinyin of the Chinese character "喵", which, just like "meow" in English, is "the characteristic crying sound of a cat" (from Oxford American Dictionaries).

It's a pinyin input method, so I chose a pinyin word as its name :P

Other information

I should write it as a module for OpenVanilla instead of as a stand-alone input method. The reason is that I'm not going to reinvent the wheel: I want to focus on the learning algorithm, not the system interfaces. This may also make it easier to port to other platforms, for example Linux.

The project homepage will be http://oaim.yhsif.com. I'll start it soon; any contributions, e.g. code, ideas or artwork, are appreciated.

16:30:55 by fishy

May the Force be with you. RAmen