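As the text
《 M♪o 之學習筆記本《子》開關》 puts it:
The "…" here is not a blank left untranslated. It marks M♪o's own preliminary exercises, and rendering them requires a software environment. Here we assume that we mainly use: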
‧ Python3
‧ python-rpi.gpio
‧ wiringPi
‧ ……
Python3 is chosen here for the reason given in the article
《W!o 的派生‧十日談之《六》》:
However, starting with Python 3.0, Python ("派生") has supported Unicode identifiers, which means:
# Chinese identifiers can be used
>>> 甲=7
>>> 乙=甲*甲-3*甲+2
>>> 甲;乙
7
30
>>>
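This brings the translation of the "software exercises" closer to the spirit of M♪o. It is also assumed that the reader is using the official Raspbian distribution, already localized into Chinese.
For the reasons explained there, the author once again chooses Python 3 ("派生三"):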
pi@raspberrypi ~ $ python3
Python 3.4.2 (default, Oct 19 2014, 13:31:11)
[GCC 4.9.1] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>
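As a small illustration of the Unicode identifiers mentioned above, the following sketch repeats the quoted arithmetic and adds a function with a Chinese name. The helper name 平方 and its parameter 數 are made up for this example and do not come from the cited articles; `str.isidentifier()` is the standard-library check for whether a string is a legal identifier.

```python
# Sketch only: Chinese identifiers in Python 3.
# 甲 and 乙 repeat the quoted session; 平方 and 數 are hypothetical names.
甲 = 7
乙 = 甲 * 甲 - 3 * 甲 + 2


def 平方(數):
    """Return the square of 數."""
    return 數 * 數


print(甲, 乙)
print(平方(甲))

# str.isidentifier() reports whether a string is a legal identifier.
print("甲".isidentifier())   # a CJK character may start an identifier
print("7甲".isidentifier())  # a leading digit is still not allowed
```

Keywords and built-in names stay in English, so only the names a program defines for itself can be written this way.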
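Although people of every nation can learn any language, the mother tongue is after all the speech a person masters first. That people understand the same natural phenomena through their own languages carries the flavor of "sameness within difference"; perhaps that is the real intent behind M♪o's development of the "算籌" (counting rods) programming language! Perhaps, in the practice of education, she had long since grasped what
《物理哲學·下中……… 》 describes: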
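Some long-time science educators have found that the difficulty of learning the mathematical subjects can be ranked as "logic" < "mathematics" < "probability". This is quite thought-provoking: does it mean "the necessary" < "the abstract" < "the uncertain"? Perhaps the unconscious "reference point", which people cannot easily perceive, is where polarized views of things come from. It is just like some of the "paradoxes" discussed in 《改不改??變不變!!》, which feel like brain teasers!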
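Was it because she had come to a deep realization of this that she could fuse it all into one? Since the author wishes to keep a faithful record, the only option is to imitate it with things of a similar kind. The choice for "logic programming" is simple enough: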
sudo pip3 install pyDatalog
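To give a feel for the kind of rule a Datalog engine evaluates, here is a minimal pure-Python sketch of bottom-up (forward-chaining) evaluation of an ancestor relation. It deliberately does not use pyDatalog's own API; the facts, the function name `ancestors`, and the person names are all assumptions made for illustration.

```python
# Sketch only: naive bottom-up evaluation of a Datalog-style rule.
# This is plain Python, not pyDatalog; every name here is illustrative.

# Facts: parent(X, Y) means X is a parent of Y.
parent = {("alice", "bob"), ("bob", "carol"), ("carol", "dave")}


def ancestors(parent_facts):
    """Derive ancestor/2 from parent/2 using the two rules

        ancestor(X, Y) <= parent(X, Y)
        ancestor(X, Y) <= parent(X, Z) & ancestor(Z, Y)

    by applying them repeatedly until no new fact appears (a fixed point).
    """
    derived = set(parent_facts)  # first rule: every parent is an ancestor
    while True:
        # second rule: join parent(X, Z) with ancestor(Z, Y)
        new = {(x, y)
               for (x, z) in parent_facts
               for (z2, y) in derived
               if z == z2}
        if new <= derived:       # nothing new was derived: stop
            return derived
        derived |= new


ancestor = ancestors(parent)
print(sorted(ancestor))
```

A real engine adds indexing, unification over variables, and smarter evaluation strategies; the loop above only shows the declarative idea of stating rules and letting the system derive the consequences.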
Interested readers can start with the series
《勇闖新世界︰《邏輯編程》》.
As for the "mathematical statistics" part, apart from
the brief remarks made in the text:
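《説文解字》: 壬 is positioned in the north. When yin reaches its extreme, yang is born; hence the 《易》 says: "Dragons battle in the wild." To battle is to join. The character depicts the form of a person carrying a pregnancy. 亥 and 壬 are followed by 子; this is the order of birth. It shares its sense with 巫. 壬 follows 辛 and depicts a person's shin; the shin is what bears the body. All characters in the 壬 group derive from 壬.
Original meaning: skilled in the use of ingenious tools; equal to one's duties.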
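《論語》, 泰伯 (Tai Bo), Book 8
Zengzi said: A scholar must be broad-minded and resolute, for the burden is heavy and the road is long. He takes humaneness (仁) as his own burden; is that not heavy? Only with death does it end; is that not long?
Zengzi said: Able, yet asking those who are not; knowing much, yet asking those who know little; having, yet seeming not to have; full, yet seeming empty; wronged, yet not retaliating. Long ago my friend 【顏淵】 (Yan Yuan) devoted himself to this.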
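If one does not know the "equivalent weight" of humaneness (仁), how is one to calculate how "heavy" a lifetime of bearing it would be? And without knowing whether "all under heaven" is taken as one's burden, how "far" would a lifetime travel?? Yet this does not hinder understanding: even if the "weight" of "humaneness toward one person" is reckoned as equal to that person's body weight, one probably has a fair idea of the total!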
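The 《宋高僧傳》 by the Song-dynasty monk 贊寧 (Zanning) records that 李渤 (Li Bo), while prefect of Jiangzhou, once went with 白居易 (Bai Juyi) to visit Chan master 智常 (Zhichang). Li asked: "The teachings say that Mount Sumeru contains a mustard seed and a mustard seed contains Mount Sumeru. How can a mustard seed contain Sumeru?" Zhichang said: "People say the scholar has read through ten thousand scrolls. Is that so?" Li said: "I am unworthy of that empty name." Zhichang said: "From heel to crown your body measures only so many feet. Where do the ten thousand scrolls go?" Li bowed his head, speechless, and on reflection sighed in admiration.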
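It is just as Laozi said, "Empty the mind, fill the belly", which is exactly Yan Yuan's attitude to learning: "having, yet seeming not to have; full, yet seeming empty". Is this not the "mind" of a "mustard seed" holding "Sumeru"! Marvelous: nature can store 360 EB of information in 1 gram of DNA!! Reportedly the holographic principle in the string theory of physics speaks of mysteries that humanity does not yet understand: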
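The holographic principle holds that the universe we currently see is a projection of the real universe. From a broader viewpoint, the principle states that the entire universe can be regarded as a two-dimensional information structure presented on the cosmological horizon, while the three-dimensional space observed in daily life is an effective description at macroscopic scales and low energies. It is worth noting that the cosmological holographic principle has not yet been made mathematically precise.
Truly the "big data" of the phenomena of the universe and of human life!!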
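Figure: A visualization by IBM of Wikipedia's edit-history data. The several terabytes of text and images on Wikipedia are one example of big data.
Figure: Growth of global information storage capacity.
Figure: Cartoon critical of big data application, by T. Gregorius.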
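The 【definition】 is:
Big data consists of huge data sets whose size often exceeds the human ability to collect, curate, manage, and process them within an acceptable time. The size of big data changes constantly; as of 2012, a single data set ranged from several terabytes (TB) to tens of petabytes (PB).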
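In a 2001 research report and related lectures, META Group (now Gartner) analyst Doug Laney identified three dimensions of the challenges and opportunities of data growth: volume (the amount of data), velocity (the speed of data input and output), and variety (the range of data types), together called the "3V" or "3Vs". Gartner and most companies in the big data industry continue to use the 3Vs to describe big data. In 2012 Gartner revised its definition: "Big data is high-volume, high-velocity, and/or high-variety information assets that require new forms of processing to enable enhanced decision making, insight discovery, and process optimization." In addition, some organizations define a fourth V beyond the 3Vs: veracity.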
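Big data must be tabulated, compared, and analyzed by computer before objective results can be obtained. The United States began working on big data in 2012; in the same year Obama invested 200 million US dollars in big data development and stressed that big data would be the oil of the future.
Data mining studies the methods used to analyze big data.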
Add to this the changes over time in Python Blaze ("派生火焰"):
The Blaze Ecosystem provides Python users high-level access to efficient computation on inconveniently large data. Blaze can refer to both a particular library as well as an ecosystem of related projects that have spun off of Blaze development.
Blaze is sponsored primarily by Continuum Analytics, and a DARPA XDATA grant.
Parts of the Blaze ecosystem are described below:
Ecosystem
Several projects have come out of Blaze development other than the Blaze project itself.
- The Blaze Project: Translates NumPy/Pandas-like syntax to data computing systems (e.g. database, in-memory, distributed-computing). This provides Python users with a familiar interface to query data living in a variety of other data storage systems. One Blaze query can work across data ranging from a CSV file to a distributed database.
Blaze presents a pleasant and familiar interface to us regardless of what computational solution or database we use (e.g. Spark, Impala, SQL databases, No-SQL data-stores, raw-files). It mediates our interaction with files, data structures, and databases, optimizing and translating our query as appropriate to provide a smooth and interactive session. It allows the data scientists and analyst to write their queries in a unified way that does not have to change because the data is stored in another format or a different data-store. It also provides a server-component that allows URIs to be used to easily serve views on data and refer to Data remotely in local scripts, queries, and programs.
- DataShape: A data type system
DataShape combines NumPy’s dtype and shape and extends to missing data, variable length strings, ragged arrays, and more arbitrary nesting. It allows for the common description of data types from databases to HDF5 files, to JSON blobs.
- Odo: Migrates data between formats.
Odo moves data between formats (CSV, JSON, databases) and locations (local, remote, HDFS) efficiently and robustly with a dead-simple interface by leveraging a sophisticated and extensible network of conversions.
- DyND: In-memory dynamic arrays
DyND is a dynamic ND-array library like NumPy that implements the datashape type system. It supports variable length strings, ragged arrays, and GPUs. It is a standalone C++ codebase with Python bindings. Generally it is more extensible than NumPy but also less mature.
- Dask.array: Multi-core / on-disk NumPy arrays
- Dask.dataframe: Multi-core / on-disk Pandas data-frames
Dask.arrays provide blocked algorithms on top of NumPy to handle larger-than-memory arrays and to leverage multiple cores. They are a drop-in replacement for a commonly used subset of NumPy algorithms.
Dask.dataframes provide blocked algorithms on top of Pandas to handle larger-than-memory data-frames and to leverage multiple cores. They are a drop-in replacement for a subset of Pandas use-cases.
Dask also has a general “Bag” type and a way to build “task graphs” using simple decorators as well as nascent distributed schedulers in addition to the multi-core and multi-threaded schedulers.
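The idea of a "blocked algorithm" can be sketched without Dask at all. The following pure-Python example computes a mean one block at a time, so that only a single block and two running totals are held in memory. The function name `blocked_mean` and the block size are assumptions chosen for illustration; Dask's actual implementation works on NumPy arrays and schedules blocks across cores.

```python
# Sketch only: a blocked (chunked) mean in plain Python.
# blocked_mean and block_size are hypothetical names, not Dask's API.

def blocked_mean(values, block_size):
    """Return the mean of an iterable, processing block_size items at a time."""
    total = 0.0
    count = 0
    block = []
    for value in values:
        block.append(value)
        if len(block) == block_size:
            # Reduce the finished block to two numbers, then discard it.
            total += sum(block)
            count += len(block)
            block = []
    if block:  # the last, possibly shorter, block
        total += sum(block)
        count += len(block)
    if count == 0:
        raise ValueError("cannot take the mean of an empty sequence")
    return total / count


# range() is lazy, so the full sequence is never materialized at once.
data = range(1, 1000001)
print(blocked_mean(data, block_size=10000))
```

Because each block reduces to a partial sum and a count, the blocks could equally be handed to separate workers and combined afterwards, which is the property that lets such algorithms use multiple cores.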
These projects are mutually independent. The rest of this documentation is just about the Blaze project itself. See the pages linked to above for datashape, odo, dynd, or dask.
Blaze
Blaze is a high-level user interface for databases and array computing systems. It consists of the following components:
- A symbolic expression system to describe and reason about analytic queries
- A set of interpreters from that query system to various databases / computational engines
This architecture allows a single Blaze code to run against several computational backends. Blaze interacts rapidly with the user and only communicates with the database when necessary. Blaze is also able to analyze and optimize queries to improve the interactive experience.
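The architecture described above, one symbolic expression and several interpreters, can be illustrated with a toy sketch that uses only the standard library. None of the classes or functions below (`Field`, `Less`, `Query`, `run_on_lists`, `run_on_sqlite`) belong to Blaze; they are hypothetical names invented here, and the design is a deliberately tiny assumption of how such a layering might look.

```python
# Sketch only: one symbolic query, two backends. Not the Blaze API.
import sqlite3


class Field:
    """Symbolic reference to a column; comparing it builds an expression."""
    def __init__(self, name):
        self.name = name

    def __lt__(self, value):
        return Less(self, value)


class Less:
    """Symbolic node for `field < value`; nothing is computed here."""
    def __init__(self, field, value):
        self.field = field
        self.value = value


class Query:
    """Symbolic node: select one column from rows matching a predicate."""
    def __init__(self, table, columns, predicate, select):
        self.table = table
        self.columns = columns
        self.predicate = predicate
        self.select = select


def run_on_lists(query, rows):
    """Interpreter 1: evaluate the query against in-memory lists."""
    index = {name: i for i, name in enumerate(query.columns)}
    test = query.predicate
    return [row[index[query.select]]
            for row in rows
            if row[index[test.field.name]] < test.value]


def run_on_sqlite(query, connection):
    """Interpreter 2: translate the query to SQL and let SQLite run it.
    Column and table names are trusted here; only the value is a parameter."""
    test = query.predicate
    sql = "SELECT {} FROM {} WHERE {} < ? ORDER BY rowid".format(
        query.select, query.table, test.field.name)
    return [record[0] for record in connection.execute(sql, (test.value,))]


rows = [[1, 'Alice', 100], [2, 'Bob', -200], [3, 'Charlie', 300],
        [4, 'Denis', 400], [5, 'Edith', -500]]

connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE accounts (id INTEGER, name TEXT, amount INTEGER)")
connection.executemany("INSERT INTO accounts VALUES (?, ?, ?)", rows)

# The query is written once and never mentions where the data lives.
deadbeats = Query("accounts", ["id", "name", "amount"],
                  Field("amount") < 0, "name")

print(run_on_lists(deadbeats, rows))
print(run_on_sqlite(deadbeats, connection))
```

The point of the split is that the expression records what is wanted while each interpreter decides how to obtain it, so the in-memory backend filters in Python and the SQLite backend pushes the filter down into the database.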
Checked against Python 3 ("派生三"), it can be installed and tried as follows:
sudo pip3 install blaze
Python 3.4.2 (default, Oct 19 2014, 13:31:11)
[GCC 4.9.1] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from blaze import *
>>> accounts = Symbol('accounts', 'var * {id: int, name: string, amount: int}')
>>> deadbeats = accounts[accounts.amount < 0].name
>>> L = [[1, 'Alice', 100],
...      [2, 'Bob', -200],
...      [3, 'Charlie', 300],
...      [4, 'Denis', 400],
...      [5, 'Edith', -500]]
>>> list(compute(deadbeats, L))
['Bob', 'Edith']
>>>
As for how to use this library, readers are asked to get acquainted with it on their own beforehand.