John Colagioia's Thue Language Reference Manual

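If you visit the "The Thue Programming Language" web page, you can follow the current state of the Thue language and download "thue-1.5-2012.0916.zip". If you are interested, you can go further and study the implementations of the Thue interpreter in three languages: C, Ruby, and Python. The main purpose of this post is to offer readers a selective translation of John Colagioia's "Welcome to the Thue Reference Manual".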

[Original text follows]:

> What is a Thue?

Thue is…uhm…well…

OK, I got it. Thue is a language based on the concept of the semi-Thue
grammar/process, which is named for (and possibly created by) the
Norwegian mathematician Axel Thue (pronounced “TOO-ay”). It is, in
essence, an arbitrary grammar, which can (by its arbitrary nature) be
used to define/recognize “Type 0” languages from Chomsky’s hierarchy.
Because the grammar can be used to define a language of such complexity,
the process, itself, is essentially Turing Complete.

As a result, the Thue language (which, of course, would be much funnier
if it rhymed with the Infocom “dark places” creature, but c’est la vie,
I guess…) is an arbitrary grammar system, not unlike yacc or a similar
beast, except that there is no way to distinguish between a terminal and
a nonterminal symbol–they are completely interchangeable.

> Write “Thue” with burin.

A Thue program consists of two parts: The first part is the set of
grammar/production rules, where each rule has the form:
    lhs::=rhs
where the lhs is the string to be recognized, and the rhs is the string
which is to replace the lhs. Each string (lhs and rhs) can be completely
arbitrary, except that the lhs cannot (for rather obvious reasons)
include the production symbol (“::=”). The rhs, however, is not
restricted in any way.

Terminating the rulebase is a production symbol alone on a line, surrounded
by (optional) whitespace. Following that is the description of the
initial state which Thue will work with. Each line after the rulebase is
concatenated to form the initial state.
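The file layout described above can be sketched as a small parser. This is only an illustration, not the code of the distributed interpreters: the function name `parse_thue` is made up here, and two behaviours are assumptions the manual does not state, namely that lines without a production symbol are skipped and that no whitespace is trimmed from either side of a rule.

```python
def parse_thue(source: str):
    """Split Thue source text into (rules, initial_state).

    Assumptions (not stated in the manual): lines in the rulebase that
    contain no "::=" are ignored, and lhs/rhs are kept exactly as typed.
    """
    rules = []
    lines = source.splitlines()
    for index, line in enumerate(lines):
        if line.strip() == "::=":
            # A production symbol alone on a line (whitespace allowed)
            # ends the rulebase; everything after it is concatenated
            # to form the initial state.
            return rules, "".join(lines[index + 1:])
        if "::=" in line:
            # Splitting at the first "::=" guarantees the lhs never
            # contains the production symbol; the rhs is unrestricted.
            lhs, rhs = line.split("::=", 1)
            rules.append((lhs, rhs))
    raise ValueError("rulebase is not terminated by a bare '::=' line")


program = "a::=b\nb::=c::=d\n  ::=  \nab\nba\n"
rules, state = parse_thue(program)
print(rules)   # [('a', 'b'), ('b', 'c::=d')]
print(state)   # abba
```

Splitting at the first production symbol is what lets the second rule keep "::=" inside its rhs, which the manual explicitly permits.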

Once loaded, the Thue program nondeterministically applies the rulebase to
the current state. It continues to do so until no rules apply to the
state (pragmatically, this means that no lhs can be found in the state).
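The rewriting loop can be sketched as follows. It is a minimal model under stated assumptions, not the reference implementation: the names `find_matches` and `run` are hypothetical, "nondeterministic" is modelled as a uniformly random pick among all matches, every lhs is assumed to be non-empty, and `max_steps` is a safeguard added here because a Thue program need not terminate.

```python
import random


def find_matches(rules, state):
    """Return every (position, lhs, rhs) where some lhs occurs in state."""
    matches = []
    for lhs, rhs in rules:
        start = state.find(lhs)
        while start != -1:
            matches.append((start, lhs, rhs))
            start = state.find(lhs, start + 1)
    return matches


def run(rules, state, rng=random, max_steps=10_000):
    """Apply rules until no lhs can be found in the state."""
    for _ in range(max_steps):
        matches = find_matches(rules, state)
        if not matches:
            return state  # no rule applies: the program halts
        # Assumption: any match may be chosen; here one is picked at random.
        start, lhs, rhs = rng.choice(matches)
        state = state[:start] + rhs + state[start + len(lhs):]
    raise RuntimeError("step limit reached; the program may not terminate")


print(run([("a", "b")], "aaa"))  # prints bbb whichever matches are picked
```

With the rules `ab::=X` and `bc::=Y` and the state `abc`, the same loop may end in either `Xc` or `aY`, which is the kind of divergence nondeterminism allows.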

> What is a burin?

Don’t worry about it. It’s an Infocom joke.

Actually, let’s go with it. A burin is a tool to inscribe mystical symbols
into an object. Thue has one.

> Examine burin.

Added to this simple system are two strings which are used to permit Thue
to communicate with the outside world.

The first of these is the input symbol (“:::”). The input symbol is
actually the lhs of an implicit rule of which the user (or system’s “input
stream”) is a component. The input symbol, therefore, is replaced by a
line of text received from the “input stream.”

As a counterpart of input, the output symbol (“~”) is supplied. Like the
input symbol, the output symbol triggers an implicit rule which, in this
case, encompasses the “output stream.” The specific effect is that all
text to the right of the output symbol in the rhs of a production is sent
to the output stream.

Note that either (or both) of these implicit rules may be overridden by
providing explicit rules that perform some other task.
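One way to picture the two implicit rules is as special cases in the step that performs a replacement. The sketch below is an assumption-laden reading of the description above, not the behaviour of any particular interpreter: the function name `rewrite` and its `read_line`/`write` parameters are hypothetical, the input symbol is assumed to appear in the rhs and to be swapped for one line of input, and any rhs text to the left of the output symbol is assumed to stay in the state.

```python
def rewrite(state, start, lhs, rhs, read_line=input, write=print):
    """Replace the lhs found at `start`, honouring ':::' and '~' in the rhs."""
    if ":::" in rhs:
        # Input: the input symbol is replaced by one line from the input stream.
        replacement = rhs.replace(":::", read_line(), 1)
    elif "~" in rhs:
        # Output: all text to the right of '~' is sent to the output stream.
        kept, _, shown = rhs.partition("~")
        write(shown)
        replacement = kept  # assumption: text left of '~' remains in the state
    else:
        replacement = rhs
    return state[:start] + replacement + state[start + len(lhs):]


# Stand-ins for the real streams, so the example runs without a terminal.
printed = []
print(rewrite("xAy", 1, "A", ":::", read_line=lambda: "hi"))  # xhiy
print(rewrite("xAy", 1, "A", "~Hello", write=printed.append))  # xy
print(printed)  # ['Hello']
```

Passing the streams in as parameters keeps the sketch testable; the defaults `input` and `print` add their own newline handling, which is a further assumption.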

> Examine Thue.

The implementation of Thue, itself, is rather uninteresting, except for
three command-line switches:
    d    Activates “Debug Mode,” which prints the state immediately
         after any rule is applied.
    l    Activates “Left Mode,” which requires Thue to apply rules
         deterministically in a left-to-right fashion.
    r    Activates “Right Mode,” which is identical to “Left Mode,”
         except that rule application occurs right-to-left.
The command-line switches must appear after the Thue filename, and the
last incidence of ‘l’ or ‘r’ overrides all others.
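The switches can be modelled roughly as below. The manual does not spell out how "left-to-right" is decided, so this sketch assumes that Left Mode picks the match with the smallest start position and Right Mode the largest, with ties going to the earlier rule. The names `parse_switches` and `choose_match`, and the idea of passing the switches as one string of letters, are hypothetical.

```python
def parse_switches(switches: str):
    """Return (debug, mode); the last 'l' or 'r' overrides all others."""
    debug = "d" in switches
    mode = None  # None stands for the nondeterministic default
    for letter in switches:
        if letter == "l":
            mode = "left"
        elif letter == "r":
            mode = "right"
    return debug, mode


def choose_match(rules, state, mode):
    """Pick one (position, lhs, rhs) deterministically, or None if none match."""
    matches = []
    for lhs, rhs in rules:
        start = state.find(lhs)
        while start != -1:
            matches.append((start, lhs, rhs))
            start = state.find(lhs, start + 1)
    if not matches:
        return None
    if mode == "left":
        return min(matches, key=lambda match: match[0])
    if mode == "right":
        return max(matches, key=lambda match: match[0])
    raise ValueError("mode must be 'left' or 'right'")


rules = [("ab", "X"), ("bc", "Y")]
print(parse_switches("dlr"))               # (True, 'right')
print(choose_match(rules, "abc", "left"))   # (0, 'ab', 'X')
print(choose_match(rules, "abc", "right"))  # (1, 'bc', 'Y')
```

On the state `abc`, the two modes commit to different rewrites, which removes the choice that the default mode leaves open.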

> Look under Thue.

Sample programs included are:
    dec.t       Decrements a binary number
    hello.t     Hello, World!
    inc.t       Increments a binary number
    incany.t    Increments a binary number input by the user
    test.t      A simple example to highlight nondeterminism
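To give a feel for what such a program looks like, here is a small binary incrementer written for this post; it is not the distributed inc.t, whose contents are not reproduced here. The markers `^` (start of number) and `+` (carry) are arbitrary choices, and `run_thue` is a deliberately tiny, hypothetical interpreter that always tries rules in file order. Because the state only ever contains one `+`, at most one rule matches at any time, so the order of application does not affect the result.

```python
INCREMENT = """\
0+::=1
1+::=+0
^+::=^1
::=
^1011+
"""


def run_thue(source):
    """Run a Thue program that uses no input or output symbols."""
    # Assumption: the terminator line is exactly "::=" with no whitespace.
    rule_text, _, state_text = source.partition("\n::=\n")
    rules = [tuple(line.split("::=", 1)) for line in rule_text.splitlines()]
    state = "".join(state_text.splitlines())
    while True:
        for lhs, rhs in rules:
            if lhs in state:
                # Rewrite the first occurrence of this lhs, then rescan.
                state = state.replace(lhs, rhs, 1)
                break
        else:
            return state  # no lhs can be found: the program halts


print(run_thue(INCREMENT))  # ^1100
```

The state passes through `^101+0` and `^10+00` before halting at `^1100`, that is, binary 1011 plus one.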

> Amusing.

Apologies to Axel Thue for mangling the pronunciation of his name for a
cheap joke. Apologies to whatever is left of Infocom for (unknowingly)
supplying the format of the cheap joke.

> Exit.

[Selective translation follows]:

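What is Thue?
Thue is a language built on the semi-Thue grammar, or derivation process, named after the Norwegian mathematician Axel Thue and pronounced "TOO-ay". In essence it is an arbitrary grammar, which can be used to "define" or "recognize" the Type 0 languages of Chomsky's hierarchy. Because the grammar can define languages of that complexity, the derivation process is, in substance, Turing complete.
A Thue grammar does not distinguish between "terminal" and "nonterminal" symbols; they are completely interchangeable.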
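Writing a Thue program, in the "engraving" (burin) style, goes as follows:
A program consists of two parts. The first part is the "grammar", also called the "production rules", and each rule has the form "lhs::=rhs". Here lhs is the left-hand string, the one to be "recognized", which is then "replaced" by the right-hand string. Both strings may be built arbitrarily, except that, for self-evident reasons, the left-hand string may not contain the production symbol "::="; the right-hand string is not restricted in this way.
The rulebase is terminated by an "empty rule" line, one that holds only the production symbol "::="; you may surround it with whitespace if you like. The lines after it are concatenated to form the program's "initial state", the starting string.
Once the program is loaded, the rulebase is applied "nondeterministically": any rule that matches the current state (beginning with the initial state) may be used, and the state is rewritten step by step until no rewrite is possible. In practice, that means no lhs can be found in the current state any more.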
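The Thue language has two "mystical" symbols to "inscribe". The first is ":::", which stands for the "input stream" read from standard input; the symbol is "replaced" by a line that the user types. The other is "~", its counterpart, which stands for the "output stream" sent to standard output; all text that follows it in the rhs is written out. They are "mystical" because the rules behind them are implicit, and a program may override them with explicit rules.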
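The current implementation of Thue has three switch-like options, which must come after the Thue program's filename:
d: Debug mode; prints the current state after any production rule is applied.
l: Left mode; forces Thue to apply rules deterministically, from left to right.
r: Right mode; forces Thue to apply rules deterministically, from right to left.
Either "l" or "r" overrides the nondeterministic default; if both appear, the last occurrence on the command line wins.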