- Display pinyin for both Simplified and Traditional Chinese
- The scope of Simplified Chinese characters is based on the Table of General Standard Chinese Characters (通用规范汉字表)
- The scope of Traditional Chinese characters is based on the Big-5-2003 (五大碼-2003)
- The scope of Japanese Kanji is based on the Jōyō kanji list (常用漢字表(平成22年内閣告示第2号))
- Hiragana (e.g. あ) and katakana (e.g. ア) are available
The font used here is based on Source-Han-TrueType.
This is a TTF version of Source Han Sans/Source Han Serif with reduced file size. All required Chinese characters are included.
M+ M Type-1's mplus-1m-medium.ttf is used for the pinyin part of this font.
The font used here is based on 小赖字体/Xiaolai Font.
Hangul characters (U+A960 ꥠ to U+D7FB ퟻ) are removed from this font to reduce the number of glyphs.
SetoFontSP is used for the pinyin part of this font.
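The Hangul removal described above can be sketched as a filter over a code point table. This is a minimal illustration, not the project's actual code: `drop_hangul` and the integer-keyed `cmap` dict are hypothetical names chosen for the example.

```python
from typing import Dict

# Range taken from the text above: U+A960 to U+D7FB.
HANGUL_START = 0xA960
HANGUL_END = 0xD7FB


def drop_hangul(cmap: Dict[int, str]) -> Dict[int, str]:
    """Return a copy of a code point -> glyph name table without Hangul.

    `cmap` is a hypothetical dict keyed by integer code point; the real
    font dump may use a different key format.
    """
    return {
        code_point: glyph_name
        for code_point, glyph_name in cmap.items()
        if not (HANGUL_START <= code_point <= HANGUL_END)
    }


sample = {0x5F3A: "cid_qiang", 0xA960: "cid_hangul_a", 0xD7FB: "cid_hangul_b"}
filtered = drop_hangul(sample)
```

Only the mapping is filtered here; the glyphs that become unreferenced would still have to be dropped from the glyf table in a separate step.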
- macOS 10.15(Catalina)
- Python 3.7
- otfcc
$ pyenv global 3.7.2
$ pip install -r requirements.txt
otfcc is lightweight and supports IVS.
jq can reshape the data format you have into the one you want with very little effort.
# Install Xcode by mas-cli
$ mas install 497799835
# Note: Xcode initially gets an error because the [Command line Tools:] list box is blank.
# The following solutions will fix this problem.
# Refer to [Error: xcode-select: error: tool 'xcodebuild' requires Xcode, but active developer directory '/Library/Developer/CommandLineTools' is a command line tools instance](https://qiita.com/eytyet/items/59c5bad1c167d5addc68)
# Install otfcc
$ brew tap caryll/tap
$ brew install otfcc-mac64
- Make a homograph dictionary (optional)
$ cd <PROJECT-ROOT>/res/phonics/duo_yin_zi/scripts/
$ python make_pattern_table.py
- Make a Unicode table of the target Chinese characters (optional)
$ cd <PROJECT-ROOT>/res/phonics/unicode_mapping_table/
$ python make_unicode_pinyin_map_table.py
- Build the font
$ cd <PROJECT-ROOT>
$ time python src/main.py --style han_serif
or
$ time python src/main.py --style handwritten
METADATA_FOR_PINYIN = {
"pinyin_canvas":{
"width" : 850, # The width of the canvas.
"height" : 283.3, # The height of the canvas.
"base_line": 935, # The height from the bottom of the Chinese character canvas to pinyin canvas.
"tracking" : 22.145 # Character spacing in the pinyin display area (Tracking is about uniform spacing across a text selection).
},
"expected_hanzi_canvas":{
"width" : 1000, # Expected Width of the Chinese character canvas.
"height": 1000, # Expected height of the Chinese character canvas.
}
}
Refer to pinyin_glyph.py and config.py.
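To show how these canvas values could interact, here is a rough sketch that fits the letters of one pinyin syllable into the pinyin canvas and centres them over the Chinese character canvas. It is an assumption about one plausible layout rule, not the logic in pinyin_glyph.py; `layout_pinyin`, `glyph_width` and `glyph_height` are hypothetical names.

```python
# Values copied from METADATA_FOR_PINYIN above.
METADATA_FOR_PINYIN = {
    "pinyin_canvas": {
        "width": 850,
        "height": 283.3,
        "base_line": 935,
        "tracking": 22.145,
    },
    "expected_hanzi_canvas": {"width": 1000, "height": 1000},
}


def layout_pinyin(pinyin, glyph_width, glyph_height, meta=METADATA_FOR_PINYIN):
    """Return (scale, x_positions, y) for the letters of one syllable.

    glyph_width / glyph_height describe one fixed-width Latin letter in
    the source font; both are hypothetical inputs for this sketch.
    """
    canvas = meta["pinyin_canvas"]
    hanzi = meta["expected_hanzi_canvas"]
    count = len(pinyin)
    tracking_total = canvas["tracking"] * (count - 1)

    # Shrink until the letters fit both the height and the width.
    scale = min(
        canvas["height"] / glyph_height,
        (canvas["width"] - tracking_total) / (count * glyph_width),
    )

    letter_width = glyph_width * scale
    total_width = letter_width * count + tracking_total
    # Centre the whole syllable over the Chinese character canvas.
    start_x = (hanzi["width"] - total_width) / 2
    step = letter_width + canvas["tracking"]
    x_positions = [start_x + index * step for index in range(count)]
    return scale, x_positions, canvas["base_line"]


scale, xs, y = layout_pinyin("ji1", glyph_width=500, glyph_height=1000)
```

With a short syllable the height limits the scale; with a long one such as "zhuang1" the width does, so the letters shrink further.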
A glyf can be split into components that other glyphs reference. Reusing components reduces the file size, and because each reference is placed by an affine transformation, its size and position are easy to set.
Example of a glyph built from references:
"cid48219": {
"advanceWidth": 2048,
"advanceHeight": 2628.2,
"verticalOrigin": 1803,
"references": [
{
"glyph": "arranged_ji1", "x": 0, "y": 0, "a": 1, "b": 0, "c": 0, "d": 1
},
{
"glyph": "cid48219.ss00", "x": 0, "y": 0, "a": 1, "b": 0, "c": 0, "d": 1
}
]
},
The transformation entries determine the values of an affine transformation applied to the component prior to its being incorporated into the parent glyph. In the component matrix [a b c d e f], a to d form the linear part (scale, rotation, shear) and e and f are the offsets.
In a reference, a to d are these affine values and x and y are the offsets. This tool uses only a and d (scale) and x and y (move).
Note: For unknown reasons, otfccbuild loses glyphs when a and d have the same value. Different values are applied correctly, so set a=0.9 and d=0.91 for a 90% scale.
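The following sketch shows how a reference entry with that workaround might be built and what it does to a point. It assumes the conventional affine form, where the offset is added unscaled after the linear part; `make_reference`, `apply_reference` and the glyph name `"j"` are hypothetical.

```python
def make_reference(glyph_name, scale, x, y):
    """Build a reference dict in the shape shown in the JSON example above.

    Workaround from the note above: d is nudged by 0.01 so that a and d
    are never equal.
    """
    return {
        "glyph": glyph_name,
        "x": x,
        "y": y,
        "a": scale,
        "b": 0,
        "c": 0,
        "d": round(scale + 0.01, 4),
    }


def apply_reference(ref, point):
    """Apply the reference's transform to one (x, y) point.

    Assumed form: x' = a*x + c*y + x_offset, y' = b*x + d*y + y_offset.
    """
    px, py = point
    new_x = ref["a"] * px + ref["c"] * py + ref["x"]
    new_y = ref["b"] * px + ref["d"] * py + ref["y"]
    return new_x, new_y


ref = make_reference("j", scale=0.9, x=100, y=935)  # "j" is a placeholder name
moved = apply_reference(ref, (100, 200))
```

Because b and c stay 0, the component is only scaled and moved, never rotated or sheared.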
Refer to pinyin_glyph.py.
The "aalt" feature is used to display alternative characters.
- "aalt_0" is set to "gsub_single". It is used for symbol characters and for cases where the pronunciation of only one Chinese character changes.
- "aalt_1" is set to "gsub_alternate". It is used when the pronunciation of more than two Chinese characters changes.
The "rclt" feature is used for homograph substitution through chaining contextual substitution.
- "pattern one" covers phrases in which the pronunciation of only one Chinese character changes.
- "pattern two" covers phrases in which the pronunciation of more than two Chinese characters changes.
- "exception pattern" covers duplicates that affect phrases of pattern one or pattern two.
- This font assumes horizontal writing only.
- The glyf table can store at most 65536 glyphs.
- The glyf table is large, so it is saved as a separate JSON file.
- Chinese characters that are defined more than once refer to the same glyph to reduce the number of glyphs
  (⺎:U+2E8E, 兀:U+5140, 兀:U+FA0C and 嗀:U+55C0, 嗀:U+FA0D).
- Only a fixed-width Latin alphabet font can be used for the pinyin glyf.
- The json module of the Python standard library becomes bloated and slow when converting to dict, so orjson is used.
  Refer to Choosing a faster JSON library for Python and
  PythonのJSONパーサのメモリ使用量と処理時間を比較してみる (Comparing the memory usage and processing time of Python JSON parsers).
- ssNN ranges from ss00 to ss20.
  Refer to Tag: 'ss01' - 'ss20'.
- Chinese pinyin is simplified in the glyf table (yī -> yi1).
- Specific pronunciations (e.g. 呣 m̀, 嘸 m̄) are excluded because they are not included in Unicode.
- Phrases have been added to overwrite.txt for various purposes:
  - Register pinyin that cannot be acquired by pypinyin
  - Adjust the priority of pronunciations
  - Add the pronunciation of "儿" as "r"
  - Add the light tone (轻声) and integrate the pronunciations of duplicate Chinese characters
  - Exclude specific pronunciations (e.g. 呣 m̀, 嘸 m̄)
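The pinyin simplification mentioned in the notes above (yī -> yi1) can be sketched with Unicode decomposition. This is an illustration only: `simplify_pinyin` is a hypothetical name, and two choices here are assumptions rather than documented behaviour, namely writing ü as "v" and appending no digit for the light tone.

```python
import unicodedata

# Combining marks for the four tones: macron, acute, caron, grave.
TONE_MARKS = {"\u0304": "1", "\u0301": "2", "\u030c": "3", "\u0300": "4"}
DIAERESIS = "\u0308"


def simplify_pinyin(syllable):
    """Convert tone-marked pinyin to ASCII letters plus a tone digit."""
    tone = ""
    letters = []
    # NFD splits a letter such as "ī" into "i" plus a combining macron.
    for char in unicodedata.normalize("NFD", syllable):
        if char in TONE_MARKS:
            tone = TONE_MARKS[char]
        elif char == DIAERESIS:
            # Assumption: ü is written as "v" in the simplified form.
            if letters and letters[-1] == "u":
                letters[-1] = "v"
        else:
            letters.append(char)
    # Assumption: a syllable without a tone mark gets no digit.
    return "".join(letters) + tone


names = [simplify_pinyin(s) for s in ("yī", "qiáng", "qiǎng", "jiàng")]
```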
- IVS responds as follows:
code | Pinyin glyf |
---|---|
0xE01E0 | None. Chinese character only |
0xE01E1 | With the standard pronunciation |
0xE01E2 | With the variational pronunciation |
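As a small illustration of the table above, the snippet below appends one of these variation selectors to a Chinese character. The constant names and `with_ivs` are hypothetical; only the three code values come from the table.

```python
# Selector values from the table above.
IVS_HANZI_ONLY = 0xE01E0
IVS_STANDARD = 0xE01E1
IVS_VARIATIONAL = 0xE01E2


def with_ivs(hanzi, selector):
    """Return the character followed by a variation selector."""
    if len(hanzi) != 1:
        raise ValueError("expected exactly one character")
    # U+E0100..U+E01EF is the Variation Selectors Supplement block.
    if not 0xE0100 <= selector <= 0xE01EF:
        raise ValueError("selector is outside the supplement block")
    return hanzi + chr(selector)


plain = with_ivs("强", IVS_HANZI_ONLY)        # no pinyin shown
alternate = with_ivs("强", IVS_VARIATIONAL)   # variational pronunciation
```

The resulting two code point sequence is what a text field has to contain for the font to pick the corresponding glyph.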
- The correspondence between ssNN and Pinyin is as follows.
  If the standard pronunciation is not placed in ssNN, GSUB immediately returns the glyph to its previous state when cmap_uvs reverts it to the standard reading.
  Therefore, a glyph for reverting to the standard pronunciation is prepared in ss01.
Naming Rules | glyf type |
---|---|
hanzi_glyf | Chinese character glyf with the standard pronunciation |
hanzi_glyf.ss00 | Chinese character glyf without Pinyin. Pinyin can be changed by simply changing the IVS code. |
hanzi_glyf.ss01 | (When the Chinese character has a variational pronunciation) Chinese character glyf with the standard pronunciation (a duplicate of hanzi_glyf, used to override the GSUB replacement) |
hanzi_glyf.ss02 | (When the Chinese character has a variational pronunciation) From ss02 onward, Chinese character glyf with a variational pronunciation |
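The naming rules in the table can be sketched as a helper that lists the glyph names for one character. `variant_glyph_names` is a hypothetical function written for this example, and it assumes one ssNN slot per reading, starting at ss01 for the standard one.

```python
MAX_STYLISTIC_SET = 20  # ssNN ranges from ss00 to ss20


def variant_glyph_names(base_name, reading_count):
    """List glyph names for a character with `reading_count` readings."""
    if reading_count < 1:
        raise ValueError("a character needs at least one reading")
    if reading_count > MAX_STYLISTIC_SET:
        raise ValueError("more readings than ssNN slots")

    # The plain name carries the standard pronunciation; ss00 has no pinyin.
    names = [base_name, "{}.ss00".format(base_name)]
    if reading_count > 1:
        # ss01 repeats the standard reading, ss02 onward are variations.
        for slot in range(1, reading_count + 1):
            names.append("{}.ss{:02d}".format(base_name, slot))
    return names


single = variant_glyph_names("cid48219", 1)
triple = variant_glyph_names("cid48219", 3)
```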
- Lookup tables can be named freely, but the names follow the rules below so that the reference source is clear:
lookup table name | reference source |
---|---|
lookup_pattern_0N | pattern one |
lookup_pattern_1N | pattern two |
lookup_pattern_2N | exception pattern |
- The order 1 to n in duoyinzi_pattern_one.txt follows marged-mapping-table.txt. Order 1 is the standard reading, and the order number matches ss0N.
e.g.:
U+5F3A: qiáng,qiǎng,jiàng #强
1, 强, qiáng, [~调|~暴|~度|~占|~攻|加~|~奸|~健|~项|~行|~硬|~壮|~盗|~权|~制|~盛|~烈|~化|~大|~劲]
2, 强, qiǎng, [~求|~人|~迫|~辩|~词夺理|~颜欢笑]
3, 强, jiàng, [~嘴|倔~]
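To make the line format above concrete, here is a rough parser and matcher. In the font itself the selection is done by GSUB lookups; this sketch only mirrors the matching idea in Python. `parse_pattern_line` and `pick_reading` are hypothetical names, and the regular expression is inferred from the three example lines only.

```python
import re

# order, character, pinyin, [phrase|phrase|...] as in the example lines.
LINE_RE = re.compile(r"^\s*(\d+),\s*(\S),\s*(\S+),\s*\[(.*)\]\s*$")


def parse_pattern_line(line):
    """Parse one line into (order, hanzi, pinyin, phrases)."""
    match = LINE_RE.match(line)
    if match is None:
        raise ValueError("unexpected line format: {!r}".format(line))
    order, hanzi, pinyin, body = match.groups()
    # "~" stands for the character itself inside each phrase.
    phrases = [part.replace("~", hanzi) for part in body.split("|") if part]
    return int(order), hanzi, pinyin, phrases


def pick_reading(entries, hanzi, text):
    """Return (order, pinyin) of the first entry whose phrase occurs in text."""
    for order, entry_hanzi, pinyin, phrases in entries:
        if entry_hanzi == hanzi and any(phrase in text for phrase in phrases):
            return order, pinyin
    return None  # the caller would fall back to the standard reading


lines = [
    "1, 强, qiáng, [~调|~大|加~]",
    "2, 强, qiǎng, [~求|~迫|~词夺理]",
    "3, 强, jiàng, [~嘴|倔~]",
]
entries = [parse_pattern_line(line) for line in lines]
result = pick_reading(entries, "强", "他很倔强")
```

The returned order number is the N of the ss0N glyph that would carry that reading.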
- The rclt lookups are grouped by reading pattern: rclt0 is "pattern one", rclt1 is "pattern two", and rclt2 is "exception pattern".
- duoyinzi_pattern_two.json and duoyinzi_exceptional_pattern.json use a notation similar to that of Glyphs and the OpenType™ Feature File.
- The ignore tag specifies the phrase to be affected, and a single quote is attached to the specific character that is affected. Refer to the ignore tag in duoyinzi_exceptional_pattern.json.