9va-pi ︰ 歲末跨年

聽聞今年牛津年度『風雲字』是

 

喜極而泣

 

,一個喜極而泣的表情符號。若問這是一個『字』嗎?或許凡是『傳情達意』之符號,都可為『文』。既然天下能識能用,為什麼不可以說是個『字』的呢!曾經米雕風光一時,

米雕是一種在大米上寫字、畫畫並配飾而成的飾品。米雕興旺於2000年後,由哈爾濱一民間藝人創立(以前街頭藝人稱之為米上刻字,源起何時不詳)。民間有傳說,但無正史可考。傳說宋徽宗年間有一窮秀才進京趕考,名落孫山,盤纏用完了,饑渴之極突發奇想,當街用糯米在其上寫人名和一個福字,不想求字之人甚多,收入頗豐。一年後居然成為一名米糧商人,衣錦還鄉。

 

米里乾坤

 

,那『米里乾坤』是否為藝術的耶?!就像有人說『歲末』,有人講『跨年』,莫非這『歲』與『年』竟成了異物,說講不到一塊去了乎!?

所以『形聲字譜』能形大千世界之狀,能象自然萬有之聲,也能譜寰宇眾生之情。將之用於『溝通彼此』豈非備矣哉。

9va-pi ︰ Gnu Speech 編譯安裝

若想了解『 gnuspeech 』是什麼?最好聽聽官網怎麼講︰

What is gnuspeech?

gnuspeech makes it easy to produce high quality computer speech output, design new language databases, and create controlled speech stimuli for psychophysical experiments. gnuspeechsa is a cross-platform module of gnuspeech that allows command line, or application-based speech output. The software has been released as two tarballs that are available in the project Downloads area of http://savannah.gnu.org/projects/gnuspeech. Those wishing to contribute to the project will find the OS X (gnuspeech) and CMAKE (gnuspeechsa) sources in the Git repository on that same page. The gnuspeech suite still lacks some of the database editing components (see the Overview diagram below) but is otherwise complete and working, allowing articulatory speech synthesis of English, with control of intonation and tempo, and the ability to view the parameter tracks and intonation contours generated. The intonation contours may be edited in various ways, as described in the Monet manual. Monet provides interactive access to the synthesis controls. TRAcT provides interactive access to the underlying tube resonance model that converts the parameters into sound by emulating the human vocal tract.

The suite of programs uses a true articulatory model of the vocal tract and incorporates models of English rhythm and intonation based on extensive research that sets a new standard for synthetic speech.

The original NeXT computer implementation is complete, and is available from the NeXT branch of the SVN repository linked above. The port to GNU/Linux under GNUStep, also in the SVN repository under the appropriate branch, provides English text-to-speech capability, but parts of the database creation tools are still in the process of being ported.

Credits for research and implementation of the gnuspeech system appear in the section Thanks to those who have helped below. Some of the features of gnuspeech, together with the tools that are part of the software suite, include:

  • A Tube Resonance Model (TRM) for the human vocal tract (also known as a transmission-line analog, or a waveguide model) that truly represents the physical properties of the tract, including the energy balance between the nasal and oral cavities as well as the radiation impedance at lips and nose.
  • A TRM Control Model, based on formant sensitivity analysis, that provides a simple, but accurate method of low-level articulatory control. By using the Distinctive Region Model (DRM) only eight slowly varying tube section radii need be specified. The glottal (vocal fold) waveform and various suitably “coloured” random noise signals may be injected at appropriate places to provide voicing, aspiration, frication and noise bursts.
  • Databases which specify: the characteristics of the articulatory postures (which loosely correspond to phonemes); rules for combinations of postures; and information about voicing, frication and aspiration. These are the data required to produce specific spoken languages from an augmented phonetic input. Currently, only the database for the English language exists, though French vowel postures are also included.
  • A text-to-augmented-phonetics conversion module (the Parser) to convert arbitrary text, preferably incorporating normal punctuation, into the form required for applying the synthesis methods.
  • Models of English rhythm and intonation based on extensive research that are automatically applied.
  • “Monet”—a database creation and editing system, with a carefully designed graphical user interface (GUI) that allows the databases containing the necessary phonetic data and dynamic rules to be set up and modified in order that the computer can “speak” arbitrary languages.
  • A 70,000+ word English Pronouncing Dictionary with rules for derivatives such as plurals and adverbs, and including 6000 given names. The dictionary also provides part-of-speech information to facilitate later addition of grammatical parsing that can further improve the excellent pronunciation, rhythm and intonation.
  • Sub-dictionaries that allow different user- or application-specific pronunciations to be substituted for the default pronunciations coming from the main dictionary (not yet ported).
  • Letter-to-sound rules to deal with words that are not in the dictionaries
  • A parser to organise the input and deal with dates, numbers, abbreviations, etc.
  • Tools for managing the dictionary and carrying out analysis of speech.
  • “Synthesizer”—a GUI-based application to allow experimentation with a stand-alone TRM. All parameters, both static and dynamic, may be varied and the output can be monitored and analysed. It is an important component in the research needed to create the databases for target languages.

〔圖:Overview of the main Articulatory Speech Synthesis System(tts-block-diagram)〕

 

Why is it called gnuspeech?

It is a play on words. This is a new (g-nu) “event-based” approach to speech synthesis from text, that uses an accurate articulatory model rather than a formant-based approximation. It is also a GNU project, aimed at providing high quality text-to-speech output for GNU/Linux, Mac OS X, and other platforms. In addition, it provides comprehensive tools for psychophysical and linguistic experiments as well as for creating the databases for arbitrary languages.

What is the goal of the gnuspeech project?

The goal of the project is to create the best speech synthesis software on the planet.

 

由於作者沒有 Mac OS X 的環境,此處僅僅依據 gnuspeechsa-0.1.5.tar.gz 內之 INSTALL 文件,驗證樹莓派上的安裝如下︰
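
編譯之前,樹莓派上大致還需要 C++ 編譯器、CMake 與 ALSA 播放工具等。以下是一份假設性的事前準備示例(套件名稱依 Raspbian 慣例,請以實際環境為準)︰

# 假設性示例︰安裝編譯與試聽可能用到的工具
sudo apt-get update
sudo apt-get install build-essential cmake alsa-utils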

 

mkdir gnuspeech
cd gnuspeech/

# 取得軟體
wget http://ftp.gnu.org/gnu/gnuspeech/gnuspeechsa-0.1.5.tar.gz
tar -zxvf gnuspeechsa-0.1.5.tar.gz 

# 編譯及安裝
cd gnuspeechsa-0.1.5/
pkg_dir=$PWD
mkdir ../GnuspeechSA-build
cd ../GnuspeechSA-build
cmake -D CMAKE_BUILD_TYPE=Release $pkg_dir
make
sudo make install
sudo ldconfig

# 測試
./gnuspeech_sa -c $pkg_dir/data/en -p /tmp/test_param.txt -o /tmp/test.wav "Hello world." && aplay -q /tmp/test.wav
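
若想把上面的測試指令包成可重複使用的小工具,下面是一個假設性的包裝腳本草稿(speak.sh 為假想檔名,路徑沿用前述之編譯目錄,請依實際環境調整)︰

#!/bin/bash
# speak.sh:假設性示例,把命令列上的文字交給 gnuspeech_sa 合成,再以 aplay 播放
# 假設原始碼解壓於 ~/gnuspeech/gnuspeechsa-0.1.5,且已在 ~/gnuspeech/GnuspeechSA-build 完成編譯
pkg_dir=$HOME/gnuspeech/gnuspeechsa-0.1.5
build_dir=$HOME/gnuspeech/GnuspeechSA-build
"$build_dir/gnuspeech_sa" -c "$pkg_dir/data/en" -p /tmp/speak_param.txt -o /tmp/speak.wav "$*" && aplay -q /tmp/speak.wav

# 用法示例
# ./speak.sh "Hello, Raspberry Pi."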

 

有關這個程式的簡介,讀者可以參考

README

GnuspeechSA (Stand-Alone)
==========================

GnuspeechSA is a port to C++/C of the TTS_Server in the original Gnuspeech (http://www.gnu.org/software/gnuspeech/) source code written for NeXTSTEP.
It is a command-line program that converts text to speech.

This project is based on code from Gnuspeech SVN, rev. 672, downloaded in 2014-08-02. The source code was obtained from the directories:

nextstep/trunk/ObjectiveC/Monet.realtime
nextstep/trunk/src/SpeechObject/postMonet/server.monet

This software is part of Gnuspeech.

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the COPYING file for more details.

 

雖說 gnuspeech-0.9.tar.gz 的編譯需要 Mac OS X 的環境,但是其中有些重要文件值得一讀,因此也建議讀者取得︰

wget http://ftp.gnu.org/gnu/gnuspeech/gnuspeech-0.9.tar.gz
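
取得後可先行解壓,再找出其中的說明文件與手冊(以下僅為假設性示例,實際目錄結構請以解開後的內容為準)︰

tar -zxvf gnuspeech-0.9.tar.gz
cd gnuspeech-0.9/
# 列出可能的說明文件,例如 README 以及 Monet、TRAcT 的手冊
find . -iname 'README*' -o -iname '*.pdf' -o -iname '*.rtf' | sort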

至於要怎麼玩,尚請讀者自行方便。作者亦是新手,或容改日再談的了。

9va-pi ︰ Gnu Speech 初版發行

如果蟲鳴鳥叫是天生本能,那麼人類講話也是天賦自然。但是萬物發聲之『物理模型』卻很難建造。因此

Gnuspeech

Gnuspeech is an extensible text-to-speech computer software package that produces artificial speech output based on real-time articulatory speech synthesis by rules. That is, it converts text strings into phonetic descriptions, aided by a pronouncing dictionary, letter-to-sound rules, and rhythm and intonation models; transforms the phonetic descriptions into parameters for a low-level articulatory speech synthesizer; uses these to drive an articulatory model of the human vocal tract producing an output suitable for the normal sound output devices used by various computer operating systems; and does this at the same or faster rate than the speech is spoken for adult speech.

Design

The synthesizer is a tube resonance, or waveguide, model that models the behavior of the real vocal tract directly, and reasonably accurately, unlike formant synthesizers that indirectly model the speech spectrum.[1] The control problem is solved by using René Carré’s Distinctive Region Model[2] which relates changes in the radii of eight longitudinal divisions of the vocal tract to corresponding changes in the three frequency formants in the speech spectrum that convey much of the information of speech. The regions are, in turn, based on work by the Stockholm Speech Technology Laboratory[3] of the Royal Institute of Technology (KTH) on “formant sensitivity analysis” – that is, how formant frequencies are affected by small changes in the radius of the vocal tract at various places along its length.[4]

 

或許代表一種『聲音合成』的未來。其中『聲道』

Vocal tract

The vocal tract is the cavity in human beings and in animals where sound that is produced at the sound source (larynx in mammals; syrinx in birds) is filtered.

In birds it consists of the trachea, the syrinx, the oral cavity, the upper part of the esophagus, and the beak. In mammals it consists of the laryngeal cavity, the pharynx, the oral cavity, and the nasal cavity.

The estimated average length of the vocal tract in adult male humans is 16.9 cm and 14.1 cm in adult females.[1]

〔圖:Sagittal section of human vocal tract〕

 

模型就是型塑萬物音聲特色的基礎。欣聞

 

Initial release of gnuspeech available

From: David Hill <drh-AT-firethorne.com>
To: Gnu Announce <info-gnu-AT-gnu.org>
Subject: First release of gnuspeech project software
Date: Mon, 19 Oct 2015 18:41:22 -0700
Message-ID: <AD48546B-E89C-4F7C-A2C5-D45D5C3C46A3@firethorne.com>

gnuspeech-0.9 and gnuspeechsa-0.1.5 first official release

Gnuspeech is a new approach to synthetic speech as well as a speech research tool. It comprises a true articulatory model of the vocal tract, databases and rules for parameter composition, a 70,000 word plus pronouncing dictionary, a letter-to-sound fall-back module, and models of English rhythm and intonation, all based on extensive research that sets a new standard for synthetic speech, and computer-based speech research.

There are two main components in this first official release. For those who would simply like speech output from whatever system they are using, including incorporating speech output in their applications, there is the gnuspeechsa tarball (currently 0.1.5), a cross-platform speech synthesis application, compiled using CMake.

For those interested in an interactive system that gives access to the underlying algorithms and databases involved, providing an understanding of the mechanisms, databases, and output forms involved, as well as a tool for experiment and new language creation, there is the gnuspeech tarball (currently 0.9) that embodies several sub-apps, including the interactive database creation system Monet (My Own Nifty Editing Tool), and TRAcT (the Tube Resonance Access Tool) — a GUI interface to the tube resonance model used in gnuspeech, that emulates the human vocal tract and provides the basis for an accurate rendition of human speech.

This second tarball includes full manuals on both Monet and TRAcT. The Monet manual covers the compilation and installation of gnuspeechsa on a Macintosh under OS X 10.10.x, and references the related free software that allows the speech to be incorporated in applications. Appendix D of the Monet manual provides some additional information about gnuspeechsa and associated software that is available, and details how to compile it using CMake on the Macintosh under 10.10.x (Yosemite).

The digitally signed tarballs may be accessed at

http://ftp.gnu.org/gnu/gnuspeech/
There is a list of mirrors at http://www.gnu.org/order/ftp.html and the site http://ftpmirror.gnu.org/gnuspeech will redirect to a nearby mirror

A longer project description and credits may be found at: http://www.gnu.org/software/gnuspeech/
which is also linked to a brief (four page) project history/component description, and a paper on the Tube Resonance Model by Leonard Manzara.
Signed: David R Hill
———————–
drh@firethorne.com

http://www.gnu.org/software/gnuspeech/

http://savannah.gnu.org/projects/gnuspeech

https://savannah.gnu.org/users/davidhill

 

,不過眼前恐得了解編譯安裝之法。
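
公告中提到 tarball 均經數位簽署,若欲驗證完整性,可一併下載對應的 .sig 檔,再以 gpg 檢查(以下為假設性示例;須先匯入簽署者之公鑰,驗證方能通過)︰

# 假設性示例︰下載 tarball 與其簽章檔並驗證
wget http://ftp.gnu.org/gnu/gnuspeech/gnuspeechsa-0.1.5.tar.gz
wget http://ftp.gnu.org/gnu/gnuspeech/gnuspeechsa-0.1.5.tar.gz.sig
# 若尚未匯入簽署者公鑰,gpg 會提示所需的金鑰 ID
gpg --verify gnuspeechsa-0.1.5.tar.gz.sig gnuspeechsa-0.1.5.tar.gz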

9va-pi ︰ 語音合成

什麼是『語音合成』Speech Synthesis 的呢?維基百科上說︰

語音合成是將人類語音用人工的方式所產生。若是將電腦系統用在語音合成上,則稱為語音合成器,而語音合成器可以用軟/硬體所實現。文字轉語音(text-to-speech,TTS)系統則是將一般語言的文字轉換為語音,其他的系統可以描繪語言符號的表示方式,就像音標轉換至語音一樣。

而合成後的語音則是利用在資料庫內的許多已錄好的語音連接起來。系統則因為儲存的語音單元大小不同而有所差異,若是要儲存 phone 以及 diphone 的話,系統必須提供大量的儲存空間,但是在語意上或許會不清楚。而用在特定的使用領域上,儲存整字或整句的方式可以達到高品質的語音輸出。另外,包含了聲道模型以及其他的人類聲音特徵參數的合成器則可以創造出完整的合成聲音輸出。

一個語音合成器的品質通常是決定於人聲的相似度以及語意是否能被了解。一個清晰的文字轉語音程式應該提供人類在視覺受到傷害或是得到失讀症時,能夠聽到並且在個人電腦上完成工作。從80年代早期開始,許多的電腦作業系統已經包含了語音合成器了。

 

Speech Synthesis is the artificial production of human speech. A computer system used for this purpose is called a speech computer or speech synthesizer, and can be implemented in software or hardware products. A text-to-speech (TTS) system converts normal language text into speech; other systems render symbolic linguistic representations like phonetic transcriptions into speech.[1]

Synthesized speech can be created by concatenating pieces of recorded speech that are stored in a database. Systems differ in the size of the stored speech units; a system that stores phones or diphones provides the largest output range, but may lack clarity. For specific usage domains, the storage of entire words or sentences allows for high-quality output. Alternatively, a synthesizer can incorporate a model of the vocal tract and other human voice characteristics to create a completely “synthetic” voice output.[2]

The quality of a speech synthesizer is judged by its similarity to the human voice and by its ability to be understood clearly. An intelligible text-to-speech program allows people with visual impairments or reading disabilities to listen to written works on a home computer. Many computer operating systems have included speech synthesizers since the early 1990s.

A text-to-speech system (or “engine”) is composed of two parts:[3] a front-end and a back-end. The front-end has two major tasks. First, it converts raw text containing symbols like numbers and abbreviations into the equivalent of written-out words. This process is often called text normalization, pre-processing, or tokenization. The front-end then assigns phonetic transcriptions to each word, and divides and marks the text into prosodic units, like phrases, clauses, and sentences. The process of assigning phonetic transcriptions to words is called text-to-phoneme or grapheme-to-phoneme conversion. Phonetic transcriptions and prosody information together make up the symbolic linguistic representation that is output by the front-end. The back-end—often referred to as the synthesizer—then converts the symbolic linguistic representation into sound. In certain systems, this part includes the computation of the target prosody (pitch contour, phoneme durations),[4] which is then imposed on the output speech.
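
前端『文字→音標』與後端『音標→聲音』的分工,可以用後文將介紹的 espeak 稍加體會:-x 會把前端轉出的音素印到標準輸出,-q 則只轉換、不發聲(假設系統已安裝 espeak,以下僅為示意)︰

# 只看前端:印出音素記號(phoneme mnemonics),不發聲
espeak -q -x "Hello world, 123."
# 前後端走完:直接合成並播放
espeak "Hello world, 123."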

 

〔圖:TTS 系統架構圖(550px-TTS_System.svg)〕

 

想起最早接觸是在『盲人電腦系統』中之『螢幕閱讀器』軟體裡。依稀記得那時聲音聽起來像 R2-D2 般的有電腦味。就像是『標音』調校得不好的

eSpeak

eSpeak is derived from the “Speak” speech synthesizer for British English for Acorn RISC OS computers which was originally written in 1995 by Jonathan Duddington.

A rewritten version for Linux appeared in February 2006 and a Windows SAPI 5 version in January 2007. Subsequent development has added and improved support for additional languages.

Because of infrequent updates in the last few years, several espeak forks have emerged on GitHub.[3] After discussions on espeak’s discussion list,[4][5] the espeak-ng fork managed by Reece Dunn was chosen as the new canonical home for further espeak development.

Because of its small size and many languages, it is included as the default speech synthesizer in the NVDA open source screen reader for Windows, and on the Ubuntu and other Linux installation discs.

The quality of the language voices varies greatly. Some have had more work or feedback from native speakers than others. Most of the people who have helped to improve the various languages are blind users of text-to-speech.

 

據聞卡內基美隆大學 Carnegie Mellon University 的『歡宴』

Festival

Welcome to festvox.org
This project is part of the work at Carnegie Mellon University’s speech group aimed at advancing the state of Speech Synthesis.

  • 14th February 2015: Indic voice release (Hindi, Marathi, Tamil and Telugu) and on-line demos
  • 25th December 2014: A suite of new releases:

    There is a general script that shows what you need to download, compile and run to use these new versions.

The Festvox project aims to make the building of new synthetic voices more systemic and better documented, making it possible for anyone to build a new voice. Specifically we offer:

  • Documentation, including scripts explaining the background and specifics for building new voices for speech synthesis in new and supported languages.
  • Specific scripts to build new voices in supported languages, such as US and UK English.
  • Aids to building synthetic voices for limited domains
  • Example speech databases to help building new voices.
  • Links, demos and a repository for new voices

The documentation, tools and dependent software are all free without restriction (commercial or otherwise). Licensing of voices built by these techniques is the responsibility of the builders. This work is firmly grounded within Edinburgh University’s Festival Speech Synthesis System and Carnegie Mellon University’s small footprint Flite synthesis engine.

This work has been supported by various groups including Carnegie Mellon University, the US National Science Foundation (NSF), and the US Defense Advanced Research Projects Agency (DARPA).

Requirements for building a voice
Note the techniques and processes described here do not guarantee that you’ll end up with a high quality acceptable voice, but with a little care you can likely build a new synthesis voice in a supported language in a few days, or in a new language in a few weeks (more or less depending on the complexity of the language, and the desired quality). You will need:

  • To read the documentation
  • A Unix machine (e.g. Linux, FreeBSD, Solaris, etc.) with working audio i/o. This may work on other platforms but many scripts, perhaps unnecessarily, depend on Unix utilities like awk, sed, etc.
  • Installed versions of Edinburgh University’s Festival Speech Synthesis System and Edinburgh Speech Tools (distributed with Festival).
  • A waveform viewing/labeling program like emulabel distributed as part of Macquarie University’s EMU speech database system. Although automatic labeling software is included in festvox, a display tool is necessary for diagnosis and debugging.
  • Patience and care, and a little interest in the subject of speech technology.

 

語音合成軟體,已經很有人情味的了。這兩套軟體在 raspbian jessie 上都有,有興趣的讀者可以自行安裝玩玩,安裝與試用可參考下面的示例。
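
下面是一份簡單的安裝與試用示例(套件名稱依 Debian/Raspbian 慣例,僅供參考)︰

# 安裝 eSpeak 與 Festival
sudo apt-get install espeak festival
# eSpeak:直接唸出一句英文
espeak "Good morning, Raspberry Pi."
# Festival:自標準輸入讀入文字並朗讀
echo "Good morning, Raspberry Pi." | festival --tts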

作者一路追蹤 M♪o 的步伐,終達探尋『關節‧接合‧清晰發音』

Articulatory synthesis

Articulatory synthesis refers to computational techniques for synthesizing speech based on models of the human vocal tract and the articulation processes occurring there. The shape of the vocal tract can be controlled in a number of ways which usually involves modifying the position of the speech articulators, such as the tongue, jaw, and lips. Speech is created by digitally simulating the flow of air through the representation of the vocal tract.

 

 

『形聲字譜』合成系統之玄機矣!!??

9va-pi ︰ 文本→講話

世界總在變化中。一個時代有她的音樂,創造自己的語言。據說『上古漢語』聲韻的研究︰

上古漢語指的是商朝至漢朝時期的漢語,其語音依照演進又可細分為先秦音系與漢代音系。因為上古漢語的構擬不建立在歷史比較語言學的基礎上,又因漢語非採拼音文字,故不能從據於不同時代的「拼法」來推斷古讀音。此一原則和印歐語不同。

上古音研究的基本方法是從中古漢語(《切韻》音系)倒推上古音。在中古音的基礎上,可以用《詩經》的韻部和諧聲系列(形聲字)來推測古代的發音,還可以用漢語方言的存古特徵和一些外部證據(漢藏語系、壯侗語系、苗瑤語系等語言中的漢語同源詞、借詞)。

 

〔圖:周代金文(JinwenShisongding)〕

 

,使得人們可以擬測古音古韻。或許依舊難以想像,於是有人乾脆演給你看,講給你聽︰

 

封神榜上古漢語配音 Fengsheng Bang dubbed with Old Chinese pronunciation

 

。雖說內容之真假虛實,作者不知故沒有評論。但思讓『文本』能『講話』卻也不是一件容易的事!從漢字構造的『六書』說來看,『形聲』之文字極多,維基百科詞條寫道︰

形聲

  • 條例:

許慎:「三曰形聲。形聲者,以事為名,取譬相成,江、河是也。 」[17]

  • 特徵:

在構形上,形聲字的結構很簡單,漢字是由表義(不必準確)的「形符」(或稱「意符」、「義符」),加上表音(不必準確)的「聲符」,所構成的。

漢字是語素文字,語素是音義結合的最小單位,形聲的構形因此是根據音節和語義以構造文字。

  • 辨析:

形聲字形符表構意,聲符表聲,似乎清楚明白;然而,極多形聲字 ,聲符同時兼有意符的作用,形聲與會意(甚至象形)不分。

以「趾」為例,從足,止聲。但是,止,甲骨文作止,象腳掌上腳趾之形,因此,「止」不但是聲符,也是象形的形符,趾,既是從足止聲的形聲字,更是從足從止的合體象形字。

這種聲符表義,或者既形聲又會意的漢字極多。這種現象,宋人王聖美稱之為「右文說」,清人段玉裁稱為「形聲兼會意」[18]

  • 漢字學意義:

李孝定:「文字起源於圖畫,這是大家所公認的,俱備了形和意,一旦與語言相接合,賦予了圖畫以語言的音,於是俱備了形、音、義等構成文字的三要件,便成為原始的象形文字,這是屬於表形階段;指事已屬表意文字,它本身是從表形過渡到表意階段的中間產物;會意自然是表意文字的主體,它是以象形為基礎而產生的;假借則已進入了表音階段,而且只有它纔是純粹的表音文字,形聲字則是受了它的啟示纔產生的;但形聲字一旦產生,立即令所有造字方法失去光彩,它不但成為表音文字的主流,也成為所有文字的主流,後世新增的文字,幾乎全是形聲的天下,漢字的結構,已完全成熟,無須採用其他的方法了。」[19]

裘錫圭:『最早的形聲字不是直接用意符和音符組成,而是通過在假借字上加注意符或在表意字上加注音符而產生的。就是在形聲字大量出現之後,直接用意符和音符組成形聲字,如清末以來為了翻譯西洋自然科學,特別是化學上的某些專門名詞,而造「鋅」、「鐳」、「鈾」等形聲字的情況,仍是不多見的。』[20]

劉學倫也說:「我們如果將運用轉注、假借方法所產生的形聲字,從形聲字中剔除,那麼真正運用形聲造字方法所產生的形聲字,實際上並沒有我們想像的那麼多。」[21]

王寧《漢字構形講座》:「從早期形聲字的來源看,它們不但不是表音性的產物,而且明顯是漢字頑強堅持表意性的結構。用加聲符來強化象形文字的方法之所以很快就不再使用,是因為這種做法沒有增加信息,與表意文字的性質不相適應。而其他幾類形聲字增加的都是意義信息,聲符是因為加義符被動轉化而成的。所以,形聲字是以義符為綱的。當形聲字的聲義結合的格局形成後,也有一些字是由一個義符和一個音符合成的,這種形聲字也是以義符為綱的,以音符作為區別作用的。」[22]

 

。試想,即使是同一『形聲』字,不同方言的『讀音』也大不相同。加之以世代變遷,可知『聲韻』的考據實在困難。更遑論『言語』還有『聲調』與『表情』的呢??如是我們就容易了解,設計一個程式能將

『文本』→『講話』

,宛如鄉里人聲自然呈現,誠難矣哉!!

 

其實中國也有『標音文字』,她就是人世間獨一無二的

女書

 

〔圖:王澄溪女書‧書法〕

 

女書,又名江永女書,是一種獨特的漢語書寫系統。它是一種專門給女性使用的文字,起源於中國湖南省南部永州市江永縣。其一般被用來書寫屬於湘語永全片的江永城關方言。以前在江永縣及其毗鄰的道縣、江華瑤族自治縣大瑤山,以及廣西部分地區的婦女之間流行、傳承。由於女書在文化大革命期間被嚴重破壞,再加上隨著時代的發展、女性文化水平的提高,現在女書正瀕臨滅絕。

女書的文字特點

女書文字的特點是書寫呈長菱形,字體秀麗娟細,造型奇特,也被稱為「蚊形字」。搜集到的有近2000個字符,所有字符只有點、豎、斜、弧四種筆劃,可採用當地的江永土話(屬湘語永全片)吟誦或詠唱。

與漢字不同的地方是:女書是一種標音文字,每一個字所代表的都是一個音。現時文獻搜集到的女書文字約有七百個。女書的字型雖然參考漢字,但兩者並沒有必然的關係。而且,由於女書除了日常用作書寫以外,也可以當成花紋編在衣服或布帶上,所以字型或多或少也有所遷就,變成彎彎的形狀。

女書的起源

關於女書的起源,有不同的說法。

  1. 有人根據當地婦女賽祠的花山廟興起在清代中期,結合目前發現最早的「女書」實物,推測「女書」起源於明末清初。
  2. 有人以「女書」中存在與壯、瑤等民族織錦上的編織符號類同的字符為據,認為「女字的構成源於百越記事符號」。
  3. 有人根據「女書」中大量與出土刻劃符號、彩陶圖案相類似的字符,認為其起源的時間、空間可追溯到新石器時代的仰韶文化,形成於秦始皇統一中國文字之後。
  4. 有人依據「女書」文字與原始古夷(彝)文的基本筆劃,造字法類同,認為它是帝時代的官方文字。
  5. 有人根據甲骨文、金文借字在「女書」字彙中明顯存在的特徵,認為女書是一種與甲骨文有密切關係的商代古文字的變種。
  6. 也有人認為現代「女書」是古越文字的孑遺和演變。這種觀點認為:象形字、會意字是文字體系中最早產生的文化現象,是文字創造者所處生活環境和社會文化的直接反映。這是根據「女書」象形字、會意字構成中反映的文身習俗、「干欄」住宅建築特色、稻作文化及鳥圖騰文化現象來判斷的。

存在原因

女書的存在,主要是由於中國過去的舊思想使當地女性不可以讀書識字:即她們所謂的「男書」,所以當地的女性發明了女書,以作為姊妹妯娌之間的秘密通訊方式。女書嚴禁男子學習,而一般男子亦會把女書當成是普通的花紋。女書的存在已經超過數百年。

 

。據知已經送交《ISO/UCS女書編碼提案》,

Unicode

早在2008年4月25日,女書在ISO 10646的排程已經進入第三階段[2],但要到2009年夏天的Unicode Consortium會議中才被列入5.2版標準投票。2013年9月27日,女書排程進入第五階段。截至目前(2015年6月)為止,Unicode的最新版本為8.0,但女書仍未被包括[3],仍然是測試中的腳本,獲編配的碼區段是:U+1B100-1B28F[4]

《ISO/UCS女書編碼提案》以及《女書字表》、《女書字型檔》、《女書用字比較》等文件由清華大學搶救女書SRT小組同學製作,而他們亦是向相關機構提交編碼提案的代表。

 

ISO‧UCS女書編碼提案

 

祝願能早日提案成功。