Overview
This is a script for converting strings from NFD (Normalization Form Canonical Decomposition) to NFC (Normalization Form Canonical Composition) using Google Apps Script.
Description
Here, I would like to introduce a script for Unicode normalization using Google Apps Script. In Japanese, some characters carry ゙ (the voiced sound mark) or ゚ (the semi-voiced sound mark). Such a character can be represented in 2 ways. For example, は (\u306f), HA, with the voiced sound mark becomes ば, which can be written either as the single code point \u3070 or as the 2 code points \u306f\u3099. Namely, there are cases where what is displayed as 1 character is actually stored as 2 characters. In most cases, the precomposed form like \u3070 is used. This is called NFC (Normalization Form Canonical Composition). But we sometimes meet the decomposed form like \u306f\u3099. This is called NFD (Normalization Form Canonical Decomposition). When a document including such characters stored as 2 code points is converted to a PDF file, each character is split apart like は ゙. So users often want to convert the characters constructed from 2 code points into single characters.
String.prototype.normalize was added in ES2015, but ES2015 cannot be used in Google Apps Script yet. And although I looked for a script like this for GAS, unfortunately, I couldn't find one. So I created this script.
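To make the idea concrete, here is a minimal sketch of composing a base character and a following combining mark into one code point. It only covers the は row, and the names composeHaRow, VOICED and SEMI_VOICED are hypothetical names chosen for this illustration. This is not the code in the repository, which should be consulted for the actual implementation.

```javascript
// Hypothetical lookup tables for illustration only (は row of hiragana).
// Keys are base characters; values are the precomposed (NFC) characters.
var VOICED = {
  "\u306f": "\u3070", // は + \u3099 -> ば
  "\u3072": "\u3073", // ひ + \u3099 -> び
  "\u3075": "\u3076", // ふ + \u3099 -> ぶ
  "\u3078": "\u3079", // へ + \u3099 -> べ
  "\u307b": "\u307c"  // ほ + \u3099 -> ぼ
};
var SEMI_VOICED = {
  "\u306f": "\u3071", // は + \u309a -> ぱ
  "\u3072": "\u3074", // ひ + \u309a -> ぴ
  "\u3075": "\u3077", // ふ + \u309a -> ぷ
  "\u3078": "\u307a", // へ + \u309a -> ぺ
  "\u307b": "\u307d"  // ほ + \u309a -> ぽ
};

// Replace each "base + combining mark" pair with the single precomposed character.
// \u3099 is the combining voiced mark, \u309a the combining semi-voiced mark.
function composeHaRow(text) {
  return text.replace(
    /([\u306f\u3072\u3075\u3078\u307b])([\u3099\u309a])/g,
    function (match, base, mark) {
      var table = mark === "\u3099" ? VOICED : SEMI_VOICED;
      return table[base];
    }
  );
}

var decomposed = "\u306f\u3099";          // NFD: 2 code points
var composed = composeHaRow(decomposed);  // NFC: "\u3070", 1 code point
```

A table covering only one row is of course not enough for real text. It just shows why the conversion can be done without String.prototype.normalize: each decomposed pair maps to exactly one precomposed character.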
The detailed information and how to get the script are available at https://github.com/tanaikech/ConvertNFDtoNFC.