【鼎革‧革鼎】: Raspbian Stretch (Part 6, K.3: Speech Interface, 5 middle)

Oubei Shihua (Oubei's Remarks on Poetry), "The Poetry of Su Dongpo", by Zhao Yi (Qing dynasty)

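Writing poetry in the manner of prose began with Han Yu (Changli). With Su Dongpo it was given still freer rein, broke fresh ground, and became the grand spectacle of an age. Reading him now with an even mind, one finds talent and thought overflowing, with spring arising wherever he touches; the wealth of books stored in his mind was enough for him to draw on at will, turning left and right, never failing his purpose. What is most beyond reach is the vigorous brush he was born with: crisp as an Ai pear, keen as Bingzhou scissors. No hidden meaning fails to get through, and no feeling is too hard to bring out. This is why he stands as a great master after Li Bai and Du Fu. Yet where he falls short of Li and Du also lies here. Li's poetry is like high clouds drifting through the sky, Du's poetry is like a lofty peak towering into the heavens, and Su's poetry is like flowing water running over the earth. Readers who fix their eyes on this point will grasp the true character of all three.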

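Dongpo's poetry does not belong to the school that prizes heroic grandeur. Where he surpasses others is in the brilliance of his argument and the keenness of his brush. He lifts a heavy weight as if it were light: reading him, he seems to use little force, yet the force has already gone all the way through. This is natural genius. Let me cite a few examples from his poems.

Five-character old-style verse: "Reading, I think of my elders, and always regret being born too late. Amid the bustling arena of the young, I still got to see this old man." ("Mourning Diao Jingchun") "Kindly share your spare light with me; how could deathlessness be kept to oneself?" ("Reply to Chen Jichang") "A true man prizes rising above the world; how could fame and rank make a hero?" ("Matching Tao's Poems") "Of late I have all I need; the one thing I still owe is a death." ("Presented to Scholar Zheng on Returning from Overseas")

Seven-character old-style verse: "When he set his hand to it, it was swift as wind and rain; where the brush had not yet reached, his spirit had already swallowed it." ("On the Paintings of Wang Wei and Wu Daozi") "Are the people of this world not stout and handsome? Though the body is not yet ill, within they are already weary. This old man's spirit is whole, with something inside to rely on; chatting and laughing, he could drive back fierce bears. To this day his surviving image sits still and silent, neither more nor less than before he died." ("On Yang Huizhi's Sculpture of Vimalakirti") "Though he held not a foot of whip nor an inch of blade, the thrusts of his tongue carried wind and frost." ("Seeing Off Liu Daoyuan") "Lord Yan reformed the method and brought forth new ideas, fine sinews set into the bone like an autumn hawk. The Xu father and son were also supremely elegant, strength showing beyond the characters, edges hidden within." ("Poem for the Momiao Pavilion") "The plowman wants rain, the reaper wants sun; when those leaving get a fair wind, those arriving complain. If every prayer were granted at once, the Creator would have to change a thousand times a day." ("The Sengqie Pagoda at Sizhou") "I come from a den of mountains and waters, yet I love this mountain and cannot look my fill." ("Visiting Mount Daochang and Mount He") "The youngsters of this world boast of running fast; where today is there anyone who waits for a friend as you do!" ("Going to Fuyang: Judge Li Went Ahead and Waited for Me at Fengshui Cave") "The yellow rooster hurrying the dawn is no cause for grief; all the world grows old, not I alone." ("Drinking with Zong, My Fellow Graduate") "Waking, he put brush to paper without deliberate thought; his marvelous skill alone reached the very tip of an autumn hair." ("On a Painting by Wu Daozi") "The pine a thousand feet tall is unaware of itself; those who stand on tiptoe in envy are the brambles and weeds." ("Poem on Zhao Yuedao's Lofty Studio") "When the legs give out, the mountain grows lovelier still; do not use the finite to chase the infinite." ("Climbing Mount Linglong")

These are all the very finest of Dongpo's poetry. From them the reader can see that what stands out is the height of his gift, not the toil of his effort.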

 

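When we watch someone lift a heavy weight as if it were light, surely that ease comes from somewhere? So let us follow nickoala's text and verify the environment step by step.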

Know the Sound Cards

pi@raspberrypi:~ $ more /proc/asound/cards
 0 [ALSA           ]: bcm2835 - bcm2835 ALSA
                      bcm2835 ALSA
 1 [seeed4micvoicec]: seeed-4mic-voic - seeed-4mic-voicecard
                      seeed-4mic-voicecard
pi@raspberrypi:~ $

The first is Raspberry Pi’s built-in sound card. It has an index of 0. (Note the word ALSA. It means Advanced Linux Sound Architecture. Simply put, it is the sound driver on many Linux systems.)

The second is the USB device’s sound card. It has an index of 1.

Your settings might be different. But if you are using Pi 3 with Jessie and have not changed any sound settings, the above situation is likely. For the rest of the discussion, I am going to assume:

  • Built-in sound card, index 0 → headphone jack → speaker
  • USB sound card, index 1 → microphone

The index is important. It is how you tell Raspberry Pi where the speaker and microphone are.
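Since everything that follows depends on these indexes, it can help to look them up by name instead of hard-coding them. The following is a minimal sketch, not part of nickoala's text: `card_index` is a hypothetical helper, and the sample text is copied from the card listing above so that the sketch runs without any sound hardware.

```shell
# card_index NAME: print the index of the first card whose header line
# mentions NAME. Reads /proc/asound/cards-style text on stdin.
# (Hypothetical helper, for illustration only.)
card_index() {
    awk -v name="$1" '
        # Header lines start with the numeric index, e.g. " 1 [seeed4micvoicec]: ..."
        $1 ~ /^[0-9]+$/ && index($0, name) { print $1; exit }
    '
}

# Sample text copied from the listing above.
sample=' 0 [ALSA           ]: bcm2835 - bcm2835 ALSA
                      bcm2835 ALSA
 1 [seeed4micvoicec]: seeed-4mic-voic - seeed-4mic-voicecard
                      seeed-4mic-voicecard'

spk=$(printf '%s\n' "$sample" | card_index bcm2835)
mic=$(printf '%s\n' "$sample" | card_index seeed)
echo "speaker card: $spk, mic card: $mic"
```

On a real system the same function would read the live file, for example `card_index seeed < /proc/asound/cards`.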

We are using the ReSpeaker 4Mic, which appears as card 1 (seeed-4mic-voicecard) in the listing above.

 

Test the speaker

pi@raspberrypi:~ $ speaker-test -t wav

speaker-test 1.1.3

Playback device is default
Stream parameters are 48000Hz, S16_LE, 1 channels
WAV file(s)
Rate set to 48000Hz (requested 48000Hz)
Buffer size range from 512 to 65536
Period size range from 512 to 65536
Using max buffer size 65536
Periods = 4
was set period_size = 16384
was set buffer_size = 65536
 0 - Front Left
Time per period = 0.385015
 0 - Front Left
Time per period = 1.362121
 0 - Front Left
Time per period = 1.369963
 0 - Front Left
^CTransfer failed: Bad address
pi@raspberrypi:~ $

 

Press Ctrl-C when done.

System speaker OK.

 

Record a WAV file

Enter this command, speak into the mic, then press Ctrl-C when you are finished:

pi@raspberrypi:~ $ arecord -D plughw:1,0 abc.wav
Recording WAVE 'abc.wav' : Unsigned 8 bit, Rate 8000 Hz, Mono
arecord: set_params:1363: Unable to install hw params:
ACCESS:  RW_INTERLEAVED
FORMAT:  U8
SUBFORMAT:  STD
SAMPLE_BITS: 8
FRAME_BITS: 8
CHANNELS: 1
RATE: 8000
PERIOD_TIME: 125000
PERIOD_SIZE: 1000
PERIOD_BYTES: 1000
PERIODS: 4
BUFFER_TIME: 500000
BUFFER_SIZE: 4000
BUFFER_BYTES: 4000
TICK_TIME: 0
pi@raspberrypi:~ $

 

pi@raspberrypi:~ $ arecord -D plughw:1,0 -r 48000 abc.wav
Recording WAVE 'abc.wav' : Unsigned 8 bit, Rate 48000 Hz, Mono
^CAborted by signal Interrupt...
pi@raspberrypi:~ $

-D plughw:1,0 tells arecord where the device is. In this case, the device is the mic. It is at index 1.

plughw:1,0 actually refers to “Sound Card index 1, Device 0”, because a sound card may house several devices (each of which may in turn have subdevices). Here, we don’t care about the device number and always give it a 0. The only important index is the sound card’s.
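To make the naming concrete, here is a small sketch of building and splitting such device strings. Both `plughw` and `split_plughw` are hypothetical helper names introduced only for illustration; they do nothing more than string handling and do not touch ALSA.

```shell
# plughw CARD [NUM]: print the ALSA device string; NUM (the number after
# the comma) defaults to 0, as in the text above.
plughw() {
    printf 'plughw:%s,%s\n' "$1" "${2:-0}"
}

# split_plughw SPEC: print the two numbers of a string such as plughw:1,0.
split_plughw() {
    spec=${1#plughw:}              # drop the "plughw:" prefix
    echo "${spec%%,*} ${spec#*,}"  # text before and after the comma
}

plughw 1                 # prints plughw:1,0
split_plughw plughw:0,0  # prints 0 0
```

With such a helper, the recording command from this section could be written as `arecord -D "$(plughw 1)" -r 48000 abc.wav`.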

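The ReSpeaker's default 2-channel mic sampling rate is 48 kHz, which is presumably why the first arecord attempt at the default 8000 Hz failed and adding -r 48000 succeeded.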
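To confirm which rate actually ended up in a recorded file, one option is to read it straight from the WAV header. The sketch below assumes a canonical 44-byte PCM header (sample rate stored as a 4-byte integer at byte offset 24) and a little-endian host such as the Raspberry Pi. `wav_rate` is a hypothetical helper, and the demo file is a hand-built, header-only stand-in for a real recording.

```shell
# wav_rate FILE: print the sample rate stored at byte offset 24 of a
# canonical PCM WAV header. Assumes a little-endian host, because od
# reads the 4-byte integer in host byte order.
wav_rate() {
    od -An -t u4 -j 24 -N 4 "$1" | tr -d ' \n'
}

# Build a header-only, 8-bit mono, 48000 Hz WAV file so the sketch runs
# without a microphone (48000 = 0xBB80, written as the bytes 80 BB 00 00).
demo=$(mktemp)
printf 'RIFF\044\000\000\000WAVEfmt \020\000\000\000\001\000\001\000' >  "$demo"
printf '\200\273\000\000\200\273\000\000\001\000\010\000data\000\000\000\000' >> "$demo"

rate=$(wav_rate "$demo")
echo "sample rate: $rate Hz"
rm -f "$demo"
```

On a real recording, `wav_rate abc.wav` should agree with the "Rate ... Hz" figure that arecord and aplay print.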

 

Play a WAV file

pi@raspberrypi:~ $ aplay -D plughw:0,0 abc.wav
Playing WAVE 'abc.wav' : Unsigned 8 bit, Rate 48000 Hz, Mono
pi@raspberrypi:~ $

Here, we tell aplay to play to plughw:0,0, which refers to “Sound Card index 0, Device 0”, which leads to the speaker.

If you aplay and arecord successfully, that means the speaker and microphone are working properly. We can move on to add more capabilities.
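The record-then-play check can be wrapped into one repeatable step. This is only a sketch under the assumptions of this article (mic on card 1, speaker on card 0, 48000 Hz). `run` and `loopback_test` are hypothetical names, and `-d 5` asks arecord to stop after five seconds instead of waiting for Ctrl-C. With `DRY_RUN` set, the commands are printed rather than executed, so the sketch is safe to run on a machine without sound hardware.

```shell
# run CMD...: execute the command, or just print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# loopback_test MIC_CARD SPEAKER_CARD FILE: record 5 seconds from the mic,
# then play the file back through the speaker.
loopback_test() {
    run arecord -D "plughw:$1,0" -r 48000 -d 5 "$3" &&
    run aplay -D "plughw:$2,0" "$3"
}

# Dry run: prints the two commands without touching the sound hardware.
DRY_RUN=1
loopback_test 1 0 abc.wav
```

Unsetting `DRY_RUN` (or leaving it empty) makes the same call actually record and play.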

Recording and playback OK.

 

Install Pico, the Text-to-Speech engine

pi@raspberrypi:~ $ pico2wave -w abc.wav "Good morning. How are you today?"
pi@raspberrypi:~ $ aplay -D plughw:0,0 abc.wav
Playing WAVE 'abc.wav' : Signed 16 bit Little Endian, Rate 16000 Hz, Mono
pi@raspberrypi:~ $

Test OK.

Install Pocketsphinx, the Speech-to-Text engine

sudo apt-get install pocketsphinx                     # Jessie
sudo apt-get install pocketsphinx pocketsphinx-en-us  # Stretch

pocketsphinx_continuous -adcdev plughw:1,0 -inmic yes

pocketsphinx_continuous interprets speech in real-time. It will spill out a lot of stuff, ending with something like this:

pi@raspberrypi:~ $ pocketsphinx_continuous -adcdev plughw:1,0 -inmic yes
INFO: pocketsphinx.c(145): Parsed model-specific feature parameters from /usr/share/pocketsphinx/model/en-us/en-us/feat.params
Current configuration:
[NAME]			[DEFLT]		[VALUE]
-agc			none		none
-agcthresh		2.0		2.000000e+00
-allphone				
-allphone_ci		no		no
-alpha			0.97		9.700000e-01
-ascale			20.0		2.000000e+01
-aw			1		1
-backtrace		no		no
-beam			1e-48		1.000000e-48
-bestpath		yes		yes
-bestpathlw		9.5		9.500000e+00
-ceplen			13		13
-cmn			current		current
-cmninit		8.0		40,3,-1
-compallsen		no		no
-debug					0
-dict					/usr/share/pocketsphinx/model/en-us/cmudict-en-us.dict
-dictcase		no		no
-dither			no		no
-doublebw		no		no
-ds			1		1
-fdict					/usr/share/pocketsphinx/model/en-us/en-us/noisedict
-feat			1s_c_d_dd	1s_c_d_dd
-featparams				/usr/share/pocketsphinx/model/en-us/en-us/feat.params
-fillprob		1e-8		1.000000e-08
-frate			100		100
-fsg					
-fsgusealtpron		yes		yes
-fsgusefiller		yes		yes
-fwdflat		yes		yes
-fwdflatbeam		1e-64		1.000000e-64
-fwdflatefwid		4		4
-fwdflatlw		8.5		8.500000e+00
-fwdflatsfwin		25		25
-fwdflatwbeam		7e-29		7.000000e-29
-fwdtree		yes		yes
-hmm					/usr/share/pocketsphinx/model/en-us/en-us
-input_endian		little		little
-jsgf					
-keyphrase				
-kws					
-kws_delay		10		10
-kws_plp		1e-1		1.000000e-01
-kws_threshold		1		1.000000e+00
-latsize		5000		5000
-lda					
-ldadim			0		0
-lifter			0		22
-lm					/usr/share/pocketsphinx/model/en-us/en-us.lm.bin
-lmctl					
-lmname					
-logbase		1.0001		1.000100e+00
-logfn					
-logspec		no		no
-lowerf			133.33334	1.300000e+02
-lpbeam			1e-40		1.000000e-40
-lponlybeam		7e-29		7.000000e-29
-lw			6.5		6.500000e+00
-maxhmmpf		30000		30000
-maxwpf			-1		-1
-mdef					/usr/share/pocketsphinx/model/en-us/en-us/mdef
-mean					/usr/share/pocketsphinx/model/en-us/en-us/means
-mfclogdir				
-min_endfr		0		0
-mixw					
-mixwfloor		0.0000001	1.000000e-07
-mllr					
-mmap			yes		yes
-ncep			13		13
-nfft			512		512
-nfilt			40		25
-nwpen			1.0		1.000000e+00
-pbeam			1e-48		1.000000e-48
-pip			1.0		1.000000e+00
-pl_beam		1e-10		1.000000e-10
-pl_pbeam		1e-10		1.000000e-10
-pl_pip			1.0		1.000000e+00
-pl_weight		3.0		3.000000e+00
-pl_window		5		5
-rawlogdir				
-remove_dc		no		no
-remove_noise		yes		yes
-remove_silence		yes		yes
-round_filters		yes		yes
-samprate		16000		1.600000e+04
-seed			-1		-1
-sendump				/usr/share/pocketsphinx/model/en-us/en-us/sendump
-senlogdir				
-senmgau				
-silprob		0.005		5.000000e-03
-smoothspec		no		no
-svspec					0-12/13-25/26-38
-tmat					/usr/share/pocketsphinx/model/en-us/en-us/transition_matrices
-tmatfloor		0.0001		1.000000e-04
-topn			4		4
-topn_beam		0		0
-toprule				
-transform		legacy		dct
-unit_area		yes		yes
-upperf			6855.4976	6.800000e+03
-uw			1.0		1.000000e+00
-vad_postspeech		50		50
-vad_prespeech		20		20
-vad_startspeech	10		10
-vad_threshold		2.0		2.000000e+00
-var					/usr/share/pocketsphinx/model/en-us/en-us/variances
-varfloor		0.0001		1.000000e-04
-varnorm		no		no
-verbose		no		no
-warp_params				
-warp_type		inverse_linear	inverse_linear
-wbeam			7e-29		7.000000e-29
-wip			0.65		6.500000e-01
-wlen			0.025625	2.562500e-02

INFO: feat.c(715): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
INFO: cmn.c(143): mean[0]= 12.00, mean[1..12]= 0.0
INFO: acmod.c(164): Using subvector specification 0-12/13-25/26-38
INFO: mdef.c(518): Reading model definition: /usr/share/pocketsphinx/model/en-us/en-us/mdef
INFO: mdef.c(531): Found byte-order mark BMDF, assuming this is a binary mdef file
INFO: bin_mdef.c(336): Reading binary model definition: /usr/share/pocketsphinx/model/en-us/en-us/mdef
INFO: bin_mdef.c(516): 42 CI-phone, 137053 CD-phone, 3 emitstate/phone, 126 CI-sen, 5126 Sen, 29324 Sen-Seq
INFO: tmat.c(206): Reading HMM transition probability matrices: /usr/share/pocketsphinx/model/en-us/en-us/transition_matrices
INFO: acmod.c(117): Attempting to use PTM computation module
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /usr/share/pocketsphinx/model/en-us/en-us/means
INFO: ms_gauden.c(292): 42 codebook, 3 feature, size: 
INFO: ms_gauden.c(294):  128x13
INFO: ms_gauden.c(294):  128x13
INFO: ms_gauden.c(294):  128x13
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /usr/share/pocketsphinx/model/en-us/en-us/variances
INFO: ms_gauden.c(292): 42 codebook, 3 feature, size: 
INFO: ms_gauden.c(294):  128x13
INFO: ms_gauden.c(294):  128x13
INFO: ms_gauden.c(294):  128x13
INFO: ms_gauden.c(354): 222 variance values floored
INFO: ptm_mgau.c(476): Loading senones from dump file /usr/share/pocketsphinx/model/en-us/en-us/sendump
INFO: ptm_mgau.c(500): BEGIN FILE FORMAT DESCRIPTION
INFO: ptm_mgau.c(563): Rows: 128, Columns: 5126
INFO: ptm_mgau.c(595): Using memory-mapped I/O for senones
INFO: ptm_mgau.c(835): Maximum top-N: 4
INFO: phone_loop_search.c(114): State beam -225 Phone exit beam -225 Insertion penalty 0
INFO: dict.c(320): Allocating 138623 * 20 bytes (2707 KiB) for word entries
INFO: dict.c(333): Reading main dictionary: /usr/share/pocketsphinx/model/en-us/cmudict-en-us.dict
INFO: dict.c(213): Allocated 1014 KiB for strings, 1677 KiB for phones
INFO: dict.c(336): 134522 words read
INFO: dict.c(358): Reading filler dictionary: /usr/share/pocketsphinx/model/en-us/en-us/noisedict
INFO: dict.c(213): Allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(361): 5 words read
INFO: dict2pid.c(396): Building PID tables for dictionary
INFO: dict2pid.c(406): Allocating 42^3 * 2 bytes (144 KiB) for word-initial triphones
INFO: dict2pid.c(132): Allocated 21336 bytes (20 KiB) for word-final triphones
INFO: dict2pid.c(196): Allocated 21336 bytes (20 KiB) for single-phone word triphones
INFO: ngram_model_trie.c(456): Trying to read LM in trie binary format
INFO: ngram_search_fwdtree.c(99): 790 unique initial diphones
INFO: ngram_search_fwdtree.c(148): 0 root, 0 non-root channels, 57 single-phone words
INFO: ngram_search_fwdtree.c(186): Creating search tree
INFO: ngram_search_fwdtree.c(192): before: 0 root, 0 non-root channels, 57 single-phone words
INFO: ngram_search_fwdtree.c(326): after: max nonroot chan increased to 152144
INFO: ngram_search_fwdtree.c(339): after: 722 root, 152016 non-root channels, 53 single-phone words
INFO: ngram_search_fwdflat.c(157): fwdflat: min_ef_width = 4, max_sf_win = 25
INFO: continuous.c(305): pocketsphinx_continuous COMPILED ON: May 22 2016, AT: 22:01:16

READY....
Listening...

Now, speak into the mic, and note the results. At first, you may find it funny. After a while, you realize it is horribly inaccurate.
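The wall of INFO lines can make whatever comes afterwards hard to spot. As a minimal sketch, a captured log can be filtered so that only the non-INFO lines remain. `strip_info` is a hypothetical helper, and the sample text reuses three lines from the output above so that nothing here depends on a microphone.

```shell
# strip_info: drop pocketsphinx "INFO:" log lines and blank lines from
# the text on stdin, keeping everything else.
strip_info() {
    grep -v -e '^INFO:' -e '^[[:space:]]*$'
}

# Sample lines copied from the end of the output above.
sample='INFO: continuous.c(305): pocketsphinx_continuous COMPILED ON: May 22 2016, AT: 22:01:16

READY....
Listening...'

filtered=$(printf '%s\n' "$sample" | strip_info)
printf '%s\n' "$filtered"
```

The configuration dump above also lists a -logfn option with an empty default. Pointing it at a file is presumably another way to keep the log out of the terminal, but that was not tried here.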

Confirmed.

I trust that readers who have followed along this far will find it all comes naturally to hand ◎