[Reform and Renewal]: Raspbian Stretch, Part 6 K.3, Speech Interface 6.0

To dig deeper into speech-to-text (STT), I decided to set up a Jupyter notebook environment, hoping to connect it with MIR techniques as well!

 

So I first need to check the following:

Pocketsphinx Python


Pocketsphinx is a part of the CMU Sphinx Open Source Toolkit For Speech Recognition.

This package provides a python interface to CMU Sphinxbase and Pocketsphinx libraries created with SWIG and Setuptools.

Supported platforms

  • Windows
  • Linux
  • Mac OS X

Installation

# Make sure we have up-to-date versions of pip, setuptools and wheel:
pip install --upgrade pip setuptools wheel
pip install --upgrade pocketsphinx

 

Is that feasible on this system?
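
A quick way to see whether the install took effect is to ask Python whether the package can be located at all, before trying any recognition. This is only a minimal sketch using the standard library; the helper name `is_installed` is my own and not part of pocketsphinx.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found on the import path."""
    # find_spec returns None when the module cannot be located.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    for name in ("pocketsphinx", "sphinxbase"):
        status = "found" if is_installed(name) else "missing"
        print("{:14s} {}".format(name, status))
```

This only confirms that the package is importable; it says nothing about whether an audio device can be opened.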

Trying it out, sometimes things go well:

pi@raspberrypi:~ $ python3
Python 3.5.3 (default, Jan 19 2017, 14:11:04) 
[GCC 6.3.0 20170124] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from pocketsphinx import AudioFile
>>> audio_file = '/usr/local/lib/python3.5/dist-packages/pocketsphinx/data/goforward.raw'
>>> for phrase in AudioFile(audio_file=audio_file): print(phrase)
... 
go forward ten meters
>>> 

 

and sometimes they go badly:

>>> from pocketsphinx import LiveSpeech
>>> for phrase in LiveSpeech(): print(phrase)
... 
Error opening audio device (null) for capture: Connection refused
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.5/dist-packages/pocketsphinx/__init__.py", line 206, in __init__
    self.ad = Ad(self.audio_device, self.sampling_rate)
  File "/usr/local/lib/python3.5/dist-packages/sphinxbase/ad.py", line 124, in __init__
    this = _ad.new_Ad(audio_device, sampling_rate)
RuntimeError: new_Ad returned -1
>>> 
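
Since the failure surfaces as a RuntimeError raised while LiveSpeech is being constructed, one can catch it and try several device names in turn instead of crashing at the first one. The following is a sketch under assumptions: the helper names are mine, the candidate device names are hypothetical, and which names are valid depends on the audio backend pocketsphinx was compiled against.

```python
def first_working_device(candidates, open_stream):
    """Return (device, stream) for the first candidate that opens, else (None, None)."""
    for device in candidates:
        try:
            return device, open_stream(device)
        except RuntimeError as err:
            # Matches the "new_Ad returned -1" failure shown in the traceback.
            print("cannot open {!r}: {}".format(device, err))
    return None, None


def open_live_speech(device):
    # Imported lazily so the helper above can be used without pocketsphinx.
    from pocketsphinx import LiveSpeech
    return LiveSpeech(audio_device=device)


if __name__ == "__main__":
    # Hypothetical candidates: None means "library default".
    names = [None, "sysdefault", "plughw:1,0"]
    try:
        device, speech = first_working_device(names, open_live_speech)
    except ImportError:
        device, speech = None, None
        print("pocketsphinx is not installed")
    if speech is None:
        print("no capture device could be opened")
    else:
        for phrase in speech:
            print(phrase)
```

Passing the opener as a parameter keeps the retry loop independent of pocketsphinx, so it can be exercised with a stand-in function on a machine that has no microphone.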

 

At the very least, I want to understand this error message:

>>> for phrase in LiveSpeech(audio_device='sysdefault'): print(phrase)
... 
Error opening audio device sysdefault for capture: Connection refused
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.5/dist-packages/pocketsphinx/__init__.py", line 206, in __init__
    self.ad = Ad(self.audio_device, self.sampling_rate)
  File "/usr/local/lib/python3.5/dist-packages/sphinxbase/ad.py", line 124, in __init__
    this = _ad.new_Ad(audio_device, sampling_rate)
RuntimeError: new_Ad returned -1
>>>

 

The cause:

Q: Failed to open audio device(/dev/dsp): No such file or directory

Device file /dev/dsp is missing because OSS support is not enabled in the kernel. You can either compile pocketsphinx with ALSA support by installing alsa development headers from a package libasound2 or alsa-devel and recompiling or you can install oss-compat package to enable OSS support.

The installation process is not an issue if you understand the complexity of audio subsystems in Linux. The audio subsystem is complex unfortunately, but once you get it things will be easier. Historically, audio subsystem is pretty fragmented. It includes the following major frameworks:

  • Old Unix-like DSP framework – everything is handled by the kernel-space driver. Applications interact with /dev/dsp device to produce and record audio
  • ALSA – newer audio subsystem, partially in kernel but also has userspace library libasound. ALSA also provides DSP compatibility layer through snd_pcm_oss driver which creates /dev/dsp device and emulates audio
  • Pulseaudio – even newer system which works on the top of libasound ALSA library but provides a sound server to centralize all the processing. To communicate with the library it also provides libpulse library which must be used by applications to record sound
  • Jack – another sound server, similar to Pulseaudio; it also works on top of ALSA and provides another library, libjack

There are other, less popular frameworks, but sphinxbase doesn’t support them. Examples are ESD (old GNOME sound server), ARTS (old KDE sound server), Portaudio (portable library usable across Windows, Linux and Mac).

The recommended audio framework on Ubuntu is pulseaudio.

Sphinxbase and pocketsphinx support all the frameworks and automatically select one at compile time, with pulseaudio given the highest priority. Before you install sphinxbase you need to decide which framework to use, and then set up the development package of that framework.

For example, it’s recommended to install the libpulse-dev package to provide access to pulseaudio; after that, sphinxbase will automatically work with Pulseaudio. Once you work with pulseaudio you do not need other frameworks. On an embedded device, try to configure ALSA.
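
Before deciding which framework to build against, it can help to see which ones look present on the machine. Here is a rough sketch, assuming the usual locations: /dev/dsp for OSS, /proc/asound/cards for ALSA, and the pulseaudio, pactl and jackd executables on PATH for the sound servers. The function name and its parameters are my own, and a "found" result does not prove that the corresponding development headers are installed.

```python
import os
import shutil


def probe_audio_backends(dsp_path="/dev/dsp", alsa_cards="/proc/asound/cards"):
    """Report which audio frameworks appear to be present on this machine."""
    has_pulse = (shutil.which("pulseaudio") is not None
                 or shutil.which("pactl") is not None)
    return {
        "oss": os.path.exists(dsp_path),       # old /dev/dsp interface
        "alsa": os.path.exists(alsa_cards),    # ALSA kernel side is loaded
        "pulseaudio": has_pulse,               # sound server tools on PATH
        "jack": shutil.which("jackd") is not None,
    }


if __name__ == "__main__":
    for name, present in sorted(probe_audio_backends().items()):
        print("{:12s} {}".format(name, "found" if present else "missing"))
```

The two paths are parameters only so the check can be pointed elsewhere for testing; on a real system the defaults are the ones to use.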

 

I share this stumbling, bumbling process in the hope that it at least earns a smile ☆