STEM Notes: Classical Mechanics: The Art of Simulation [Small Tools] 8, 《Big Data》 8

What matters in learning is using methods well, so as to be able to solve problems independently ◎

That is why we have long put off mentioning

the 'Python Blaze' ecosystem

The Blaze Ecosystem

The Blaze ecosystem is a set of libraries that help users store, describe, query and process data. It is composed of the following core projects:

  • Blaze: An interface to query data on different storage systems
  • Dask: Parallel computing through task scheduling and blocked algorithms
  • Datashape: A data description language
  • DyND: A C++ library for dynamic, multidimensional arrays
  • Libndtypes: A C/C++ library for a low-level version of Datashape
  • Ndtypes-python: Python bindings for libndtypes
  • Odo: Data migration between different storage systems

【Overview】

Overview

Blaze Abstracts Computation and Storage

Several projects provide rich and performant data analytics. Competition between these projects gives rise to a vibrant and dynamic ecosystem. Blaze augments this ecosystem with a uniform and adaptable interface. Blaze orchestrates computation and data access among these external projects. It provides a consistent backdrop to build standard interfaces usable by the current Python community.

【Install】

pi@raspberrypi:~ $ sudo pip3 install blaze

【Demo】

pi@raspberrypi:~ $ ipython3
Python 3.5.3 (default, Jan 19 2017, 14:11:04) 
Type "copyright", "credits" or "license" for more information.

IPython 5.1.0 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.

In [1]: from blaze import *
/usr/local/lib/python3.5/dist-packages/blaze/server/server.py:17: ExtDeprecationWarning: Importing flask.ext.cors is deprecated, use flask_cors instead.
  from flask.ext.cors import cross_origin

In [2]: 帳戶 = Symbol('帳戶', 'var * {"編號": int, "開戶人": string, "金額": int
   ...: }')

In [3]: 壞帳 = 帳戶[帳戶.金額 < 0].開戶人

In [4]: L = [[1, 'Alice',   100],
   ...:      [2, 'Bob',    -200],
   ...:      [3, 'Charlie', 300],
   ...:      [4, 'Denis',   400],
   ...:      [5, 'Edith',  -500]]

In [5]: list(compute(壞帳,L))
Out[5]: ['Bob', 'Edith']

In [6]:

─ adapted from 《【鼎革‧革鼎】︰ RASPBIAN STRETCH 《三‧戊》》

 

Because there is a 'compatibility' problem between old and new versions:

List to Postgres fails with "AttributeError: 'DiGraph' object has no attribute 'edge'" #588

You need to downgrade your networkx to <2.0 until this is fixed

※ Note:

Install

NetworkX requires Python 2.7, 3.4, 3.5, or 3.6. If you do not already have a Python environment configured on your computer, please see the instructions for installing the full scientific Python stack.

Note

If you are on Windows and want to install optional packages (e.g., scipy), then you will need to install a Python distribution such as Anaconda, Enthought Canopy, Python(x,y), WinPython, or Pyzo. If you use one of these Python distributions, please refer to their online documentation.

Below we assume you have the default Python environment already configured on your computer and you intend to install networkx inside of it. If you want to create and work with Python virtual environments, please follow instructions on venv and virtual environments.

First, make sure you have the latest version of pip (the Python package manager) installed. If you do not, refer to the Pip documentation and install pip first.

Install the released version

Install the current release of networkx with pip:

sudo pip3 install networkx==1.11
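As a sanity check after pinning the version, one can confirm both the installed release and the pre-2.0 'edge' attribute that the failing code path relies on. A minimal sketch, assuming networkx imports cleanly:

import networkx as nx

print(nx.__version__)        # expect '1.11' after the pinned install

G = nx.DiGraph()
# networkx < 2.0 exposes a G.edge attribute; 2.x removed it, which is
# precisely what raises the AttributeError reported in issue #588.
print(hasattr(G, 'edge'))    # True on 1.11, False on 2.x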

 

For those unaware of this 'downgrade', it may prove impossible even to work through the

Quickstart

This quickstart is here to show some simple ways to get started creating and manipulating Blaze Symbols. To run these examples, import blaze as follows.

>>> from blaze import *

Blaze Interactive Data

Create simple Blaze expressions from nested lists/tuples. Blaze will deduce the dimensionality and data type to use.

>>> t = data([(1, 'Alice', 100),
...           (2, 'Bob', -200),
...           (3, 'Charlie', 300),
...           (4, 'Denis', 400),
...           (5, 'Edith', -500)],
...          fields=['id', 'name', 'balance'])

>>> t.peek()
   id     name  balance
0   1    Alice      100
1   2      Bob     -200
2   3  Charlie      300
3   4    Denis      400
4   5    Edith     -500

 

Well! For the moment, let us borrow the

Blaze Tutorial

This repository contains notebooks and data for a tutorial on Blaze, a library to compute on foreign data from within Python.

This is a work in progress.

Outline

  1. Motivation – (nbviewer)

Into

We present most Blaze fundamentals while discussing the simpler topic of data migration using the into project.

  1. Basics – (nbviewer)
  2. Datatypes – (nbviewer)
  3. Internal Design – (nbviewer)

Blaze Queries

  1. Basics – (nbviewer)
  2. Databases – (nbviewer)
  3. Storing Results – (nbviewer)

 

to explore the extent of the impact?

Interested readers, why not try it yourselves ☆

 

 

 

 

 

 

 

STEM Notes: Classical Mechanics: The Art of Simulation [Small Tools] 8, 《Big Data》 7

Since the bqplot documentation is thorough:

Introduction

bqplot is a Grammar of Graphics-based interactive plotting framework for the Jupyter notebook.

(animated screencast: bqplot in action)

In bqplot, every single attribute of the plot is an interactive widget. This allows the user to integrate any plot with IPython widgets to create a complex and feature rich GUI from just a few simple lines of Python code.

Goals

  • provide a unified framework for 2-D visualizations with a pythonic API.
  • provide a sensible API for adding user interactions (panning, zooming, selection, etc)

Two APIs are provided

  • Users can build custom visualizations using the internal object model, which is inspired by the constructs of the Grammar of Graphics (figure, marks, axes, scales), and enrich their visualization with our Interaction Layer.
  • Or they can use the context-based API similar to Matplotlib’s pyplot, which provides sensible default choices for most parameters.
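The pyplot-style API can be tried in a couple of lines. A minimal sketch, assumed to run in a Jupyter notebook cell with bqplot and numpy installed (title and data are arbitrary):

import numpy as np
from bqplot import pyplot as plt

x = np.linspace(0, 2 * np.pi, 100)

plt.figure(title='A first bqplot figure')  # every element is a live widget
plt.plot(x, np.sin(x))
plt.show()                                 # renders as an interactive widget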

 

the usage is clearly explained:

 

and there are plenty of examples:

bqplot/examples/

 

there is no need to gild the lily here.

 

 

 

 

 

 

 

Dragon Boat Festival ☆

Wishing everyone a happy holiday ◎

Origin of the name

The character 「端」 means 'beginning', so 「端五」 originally meant 'the opening fifth' of the month. By the calendar, the fifth month is the 「午」 month, so 「端五」 gradually evolved into today's 「端午」[2]. The Yanjing Suishiji (《燕京歲時記》) records: 'The fifth day is the single fifth of the fifth month; 端 is but a shifted pronunciation of the word.'

「粽」 (zongzi) puns on 「中」 (to hit the mark)
To wrap zongzi is to 'wrap up a win'; to have zongzi to eat is to 'take the prize' ☆

 

 

 

 

 

 

 

 

 

STEM Notes: Classical Mechanics: The Art of Simulation [Small Tools] 8, 《Big Data》 6

How should we understand a nation's wealth? If every industry flourishes, markets are buoyant, and incomes rise year after year, is that enough to guarantee such wealth? Or is the nation not merely wealthy, but striding down the broad road toward wealth and power? People need reasons, and data, to explain what is happening now, and they want a crystal ball to foretell future trends! Since things are forever changing in the stream of time, people are driven to study

Time series

A time series (English: time series) is a statistical method of empirical economics.

Content

A time series is a set of random variables ordered in time; gross domestic product (GDP), the consumer price index (CPI), the Taiwan Capitalization Weighted Stock Index, interest rates, and exchange rates are all time series.

The sampling interval of a time series may be minutes or seconds (as with high-frequency financial data), or days, weeks, months, quarters, years, or even longer units of time.

Time series are one of the three forms of data studied in econometrics (the other two being cross-sectional data and panel data), with wide application in macroeconomics, international economics, finance, financial engineering, and related fields.

Characteristics of time-series variables

  • Nonstationarity: the variance of the series shows no long-run tendency to settle toward a constant or a linear function.
  • Time-varying volatility: the variance of a time-series variable itself changes over time.

These two characteristics make effective analysis of time-series variables very difficult.

A stationary time series is one whose statistical properties do not change over time.

Assumptions of traditional econometrics

  1. Time-series variables are assumed to be drawn at random from some stochastic process and arranged in time order, so that a (narrow-sense) stationarity must exist, i.e. the mean is fixed.
  2. The volatility of a time-series variable is assumed not to change over time, i.e. the variance is fixed. This plainly contradicts reality: it has long been observed that the volatility of stock returns varies over time and is no constant.

These two assumptions left traditional econometric methods unable to analyze real-world time series effectively. The contributions of Clive Granger and Robert Engle resolved this problem.

! And today, in the age of big data, people long all the more to master methods for spinning information into gold!!

Time series: random data plus trend, with best-fit line and different applied filters

─── from 《時間序列︰從微觀到巨觀》
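As a small illustration of the figure's idea, here is a sketch that generates random data with a linear trend and overlays a least-squares line and a moving-average filter; it assumes recent numpy, pandas, and matplotlib, and all parameters are illustrative:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
t = np.arange(200)
y = 0.05 * t + rng.normal(0.0, 1.0, 200)      # linear trend plus noise
series = pd.Series(y, index=t)

slope, intercept = np.polyfit(t, y, 1)        # least-squares best-fit line

plt.plot(t, y, alpha=0.4, label='data')
plt.plot(t, slope * t + intercept, label='best-fit line')
plt.plot(series.rolling(20).mean(), label='20-step moving average')
plt.legend()
plt.show()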

 

Since the author has already written a whole series of 《時間序列︰ □ □ 》 posts, it would not do to belabor the subject further here. We cite 《時間序列︰安斯庫姆四重奏》 only to stress the importance of 'visualization':

'Seeing is believing': understanding the local and the global relationships among statistical data. Such is

Anscombe's quartet

Anscombe's quartet consists of four datasets whose basic statistical properties are nearly identical, yet which produce completely different pictures when graphed. Each dataset contains eleven (x, y) points. The four sets were constructed in 1973 by the statistician Francis Anscombe to demonstrate the importance of graphing data before analyzing it, and how strongly outliers can affect statistical measures.

(figure: the four datasets of Anscombe's quartet, graphed)

The statistical properties shared by all four datasets are:

  Property                        Value
  Mean of x                       9
  Sample variance of x            11
  Mean of y                       7.50 (to two decimal places)
  Sample variance of y            4.122 or 4.127 (to three decimal places)
  Correlation between x and y     0.816 (to three decimal places)
  Linear regression line          y = 3.00 + 0.500x (to two and three decimal places respectively)

Of the four charts, the one drawn from the first dataset (top left) looks the most 'normal', showing the sort of correlation one expects between two random variables. The chart of the second dataset (top right) makes it obvious that the relationship between the two variables is nonlinear. In the third (bottom left) a linear relationship does exist, but a single outlier shifts the regression line and drags the correlation coefficient from 1 down to 0.816. Finally, in the fourth example (bottom right), even though the two variables share no linear relationship, one outlier alone suffices to produce a high correlation coefficient.

Edward Tufte uses Anscombe's quartet on the very first page of The Visual Display of Quantitative Information to illustrate the importance of graphing data.

The exact values of the four datasets are listed below; note that the first three datasets share the same x values.

Anscombe's quartet
      I            II           III            IV
   x     y      x     y      x      y      x      y
 10.0   8.04  10.0   9.14  10.0   7.46    8.0   6.58
  8.0   6.95   8.0   8.14   8.0   6.77    8.0   5.76
 13.0   7.58  13.0   8.74  13.0  12.74    8.0   7.71
  9.0   8.81   9.0   8.77   9.0   7.11    8.0   8.84
 11.0   8.33  11.0   9.26  11.0   7.81    8.0   8.47
 14.0   9.96  14.0   8.10  14.0   8.84    8.0   7.04
  6.0   7.24   6.0   6.13   6.0   6.08    8.0   5.25
  4.0   4.26   4.0   3.10   4.0   5.39   19.0  12.50
 12.0  10.84  12.0   9.13  12.0   8.15    8.0   5.56
  7.0   4.82   7.0   7.26   7.0   6.42    8.0   7.91
  5.0   5.68   5.0   4.74   5.0   5.73    8.0   6.89
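The shared statistics can be verified directly from the table above. A minimal sketch with numpy (the rounding mirrors the precision quoted in the table):

import numpy as np

x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y1 = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
y2 = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]
y3 = [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
y4 = [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]

for x, y in [(x123, y1), (x123, y2), (x123, y3), (x4, y4)]:
    x, y = np.asarray(x, float), np.asarray(y, float)
    slope, intercept = np.polyfit(x, y, 1)
    print(x.mean(), x.var(ddof=1),                      # mean 9, variance 11
          round(y.mean(), 2), round(y.var(ddof=1), 3),  # 7.50 and 4.12x
          round(np.corrcoef(x, y)[0, 1], 3),            # correlation 0.816
          round(intercept, 2), round(slope, 3))         # regression 3.00 + 0.500 x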

 

the main theme at play, is it not? Whoever sees this will appreciate the importance of the relationships among concepts, and of their proper ordering!

 

Come to think of it, the idea of 'picturing data' is an old one:

Exploratory data analysis

In statistics, exploratory data analysis (EDA) is an approach to analyzing data sets to summarize their main characteristics, often with visual methods. A statistical model can be used or not, but primarily EDA is for seeing what the data can tell us beyond the formal modeling or hypothesis testing task. Exploratory data analysis was promoted by John Tukey to encourage statisticians to explore the data, and possibly formulate hypotheses that could lead to new data collection and experiments. EDA is different from initial data analysis (IDA),[1] which focuses more narrowly on checking assumptions required for model fitting and hypothesis testing, and handling missing values and making transformations of variables as needed. EDA encompasses IDA.

Overview

Tukey defined data analysis in 1961 as: “[P]rocedures for analyzing data, techniques for interpreting the results of such procedures, ways of planning the gathering of data to make its analysis easier, more precise or more accurate, and all the machinery and results of (mathematical) statistics which apply to analyzing data.”[2]

Tukey's championing of EDA encouraged the development of statistical computing packages, especially S at Bell Labs. The S programming language inspired the systems S-PLUS and R. This family of statistical-computing environments featured vastly improved dynamic visualization capabilities, which allowed statisticians to identify outliers, trends and patterns in data that merited further study.

Tukey's EDA was related to two other developments in statistical theory: robust statistics and nonparametric statistics, both of which tried to reduce the sensitivity of statistical inferences to errors in formulating statistical models. Tukey promoted the use of the five-number summary of numerical data, the two extremes (maximum and minimum), the median, and the quartiles, because the median and quartiles, being functions of the empirical distribution, are defined for all distributions, unlike the mean and standard deviation; moreover, the quartiles and median are more robust to skewed or heavy-tailed distributions than traditional summaries (the mean and standard deviation). The packages S, S-PLUS, and R included routines using resampling statistics, such as Quenouille and Tukey's jackknife and Efron's bootstrap, which are nonparametric and robust (for many problems).
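The five-number summary itself is a one-liner with numpy percentiles. A sketch on the y values of Anscombe's first dataset, quoted earlier (note that numpy's default quantile interpolation differs slightly from Tukey's original hinges):

import numpy as np

data = np.array([4.26, 5.68, 7.24, 4.82, 6.95, 8.81, 8.04, 8.33, 10.84, 7.58, 9.96])
minimum, q1, median, q3, maximum = np.percentile(data, [0, 25, 50, 75, 100])
print(minimum, q1, median, q3, maximum)   # the five-number summary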

Exploratory data analysis, robust statistics, nonparametric statistics, and the development of statistical programming languages facilitated statisticians’ work on scientific and engineering problems. Such problems included the fabrication of semiconductors and the understanding of communications networks, which concerned Bell Labs. These statistical developments, all championed by Tukey, were designed to complement the analytic theory of testing statistical hypotheses, particularly the Laplacian tradition’s emphasis on exponential families.[3]

Development

John W. Tukey wrote the book Exploratory Data Analysis in 1977.[4] Tukey held that too much emphasis in statistics was placed on statistical hypothesis testing (confirmatory data analysis); more emphasis needed to be placed on using data to suggest hypotheses to test. In particular, he held that confusing the two types of analyses and employing them on the same set of data can lead to systematic bias owing to the issues inherent in testing hypotheses suggested by the data.

The objectives of EDA are to:

  • suggest hypotheses about the causes of observed phenomena
  • assess the assumptions on which statistical inference will be based
  • support the selection of appropriate statistical tools and techniques
  • provide a basis for further data collection through surveys or experiments

Many EDA techniques have been adopted into data mining, as well as into big data analytics.[6] They are also being taught to young students as a way to introduce them to statistical thinking.[7]

(figure: data science process flowchart)

 

And a basketful of techniques:

Techniques

There are a number of tools that are useful for EDA, but EDA is characterized more by the attitude taken than by particular techniques.[8]

Typical graphical techniques used in EDA include box plots, histograms, scatter plots, run charts, and stem-and-leaf plots.

Typical quantitative techniques include median polish, the trimean, and ordination.
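Two of the classic graphical techniques can be sketched in a few lines with matplotlib (the skewed sample is hypothetical):

import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
sample = rng.lognormal(mean=0.0, sigma=0.6, size=500)   # hypothetical skewed data

fig, (ax1, ax2) = plt.subplots(1, 2)
ax1.hist(sample, bins=30)    # histogram: overall shape of the distribution
ax1.set_title('histogram')
ax2.boxplot(sample)          # box plot: median, quartiles, and outliers
ax2.set_title('box plot')
plt.show()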

 

That is exactly the design intent of bqplot!

 

 

 

 

 

 

 

STEM Notes: Classical Mechanics: The Art of Simulation [Small Tools] 8, 《Big Data》 5

A paradox from over a century ago

Bertrand paradox (probability theory)

The Bertrand paradox is a paradox that the classical interpretation of probability theory can lead to. Joseph Bertrand posed it in 1888 in his Calcul des probabilités, as an example showing that if the 'mechanism' or 'method' that generates a random variable is not clearly defined, the probability is not well defined either.

Content of the paradox

The Bertrand paradox runs as follows: consider an equilateral triangle inscribed in a circle. If a chord of the circle is chosen at random, what is the probability that the chord is longer than a side of the triangle?

Bertrand offered three arguments, all apparently valid, yet leading to different results.

  1. Random chords, method 1 (red = longer than the triangle's side, blue = shorter)

    The 'random endpoints' method: choose two random points on the circumference and draw the chord joining them. To compute the probability in question, imagine the triangle rotated so that one of its vertices coincides with one endpoint of the chord. Observe that if the other endpoint lies on the arc subtended by the far side of the triangle, the chord is longer than a side of the triangle. That arc is one third of the circumference, so the probability that a random chord is longer than a side of the triangle is one third.

  2. Random chords, method 2

    The 'random radius' method: choose a radius of the circle and a point on that radius, then draw the chord through the point perpendicular to the radius. To compute the probability, imagine the triangle rotated so that one side is perpendicular to the radius. The chord is longer than a side of the triangle if the chosen point is closer to the center than the point where the side crosses the radius. The side bisects the radius, so the probability is one half.

  3. Random chords, method 3

    The 'random midpoint' method: choose a point anywhere inside the circle and draw the chord whose midpoint is that point. The chord is longer than a side of the inscribed triangle if the chosen point falls inside a concentric circle of half the radius. The smaller circle has one quarter the area of the larger one, so the probability is one quarter.

These methods can be pictured as follows. Every chord is uniquely determined by its midpoint, and the three methods give different distributions of midpoints: methods 1 and 2 give two different non-uniform distributions, while method 3 gives a uniform one. Looking instead at the chords themselves, method 2's chords appear uniform, while those of methods 1 and 3 do not.

(figures: midpoints of random chords under methods 1, 2, 3; random chords under methods 1, 2, 3)

Many other ways of choosing a chord 'at random' can be devised, and each may yield a different probability that the chord is longer than the triangle's side.
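The three answers are easy to reproduce by simulation. A Monte Carlo sketch on the unit circle, assuming a recent numpy; a chord beats the inscribed triangle's side exactly when its length exceeds √3:

import numpy as np

rng = np.random.default_rng(0)
N = 1_000_000
SIDE = np.sqrt(3.0)   # side of the equilateral triangle inscribed in r = 1

# Method 1: random endpoints, two uniform angles on the circumference.
a, b = rng.uniform(0.0, 2.0 * np.pi, (2, N))
len1 = 2.0 * np.abs(np.sin((a - b) / 2.0))

# Method 2: random radius, chord perpendicular to it at a uniform distance.
d = rng.uniform(0.0, 1.0, N)
len2 = 2.0 * np.sqrt(1.0 - d**2)

# Method 3: random midpoint, uniform over the disk's area.
r = np.sqrt(rng.uniform(0.0, 1.0, N))   # sqrt gives a uniform area density
len3 = 2.0 * np.sqrt(1.0 - r**2)

for name, length in [('endpoints', len1), ('radius', len2), ('midpoint', len3)]:
    print(name, (length > SIDE).mean())   # about 1/3, 1/2, 1/4 respectively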

It remains unresolved to this day. Consider that any open interval of the reals can be mapped onto the whole real line; must one not then take great care with the 'probability measure' on a 'sample space'? It is as startling as a function that is everywhere continuous yet nowhere differentiable!

In 1872 the father of modern analysis, the German mathematician Karl Theodor Wilhelm Weierstraß, exhibited just such a counterintuitive function, everywhere continuous yet nowhere differentiable:

f(x) = \sum_{n=0}^{\infty} a^n \cos(b^n \pi x),

where 0 < a < 1 and b is a positive odd integer such that ab > 1 + \frac{3}{2}\pi.

─── from 《時間序列︰伯特蘭悖論》
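The construction is easy to visualize numerically. A sketch plotting a 30-term partial sum with a = 0.5 and b = 13, so that ab = 6.5 > 1 + 3π/2 ≈ 5.71 (numpy and matplotlib assumed):

import numpy as np
import matplotlib.pyplot as plt

a, b = 0.5, 13                    # 0 < a < 1, b a positive odd integer
x = np.linspace(-2.0, 2.0, 4000)
f = sum(a**n * np.cos(b**n * np.pi * x) for n in range(30))   # truncated series

plt.plot(x, f, linewidth=0.5)
plt.title('Weierstrass function, 30-term partial sum')
plt.show()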

 

For this Bertrand paradox, Wikipedia offers a

Classical solution

The classical answer holds that the key lies in the method of 'random' selection of the chord. Once a method of random selection is fixed, the problem has a well-defined solution. Since no unique selection method exists, no unique solution can exist either. The three answers Bertrand presented correspond to three different selection methods, and absent further information there is no reason to prefer any one of them over another.

The Bertrand paradox and other paradoxes arising from the classical interpretation of probability prompted several more rigorous formulations, among them frequentist probability and Bayesian probability.

 

There is a flavor of 'seek benevolence and find benevolence, seek righteousness and find righteousness' about this, of 'the benevolent see benevolence, the wise see wisdom'! Take, for instance, this man Jaynes

Edwin Thompson Jaynes

Edwin Thompson Jaynes (July 5, 1922 – April 30,[1] 1998) was the Wayman Crow Distinguished Professor of Physics at Washington University in St. Louis. He wrote extensively on statistical mechanics and on foundations of probability and statistical inference, initiating in 1957 the MaxEnt interpretation of thermodynamics,[2][3] as being a particular application of more general Bayesian/information theory techniques (although he argued this was already implicit in the works of Gibbs). Jaynes strongly promoted the interpretation of probability theory as an extension of logic.


 

Jaynes systematized Pólya's 'plausible reasoning':

(photo: George Pólya, ca. 1973)

George Pólya

How to Solve It

suggests the following steps when solving a mathematical problem:

1. First, you have to understand the problem.
2. After understanding, then make a plan.
3. Carry out the plan.
4. Look back on your work. How could it be better?

If this technique fails, Pólya advises: “If you can’t solve a problem, then there is an easier problem you can solve: find it.” Or: “If you cannot solve the proposed problem, try to solve first some related problem. Could you imagine a more accessible related problem?”

George Pólya taught mathematics over a long career, studied the general patterns of mathematical thinking in depth, and promoted mathematics education throughout his life. In 1954 Pólya wrote two unusual volumes, Induction and Analogy in Mathematics and Patterns of Plausible Inference, which explore the 'heuristic' modes of thought that so often provide the entry point for mathematical discovery, and which trace the roots of 'plausibility' in the symptoms of common sense. To take an example, the typical Aristotelian syllogism:

P \Longrightarrow Q
P is true, \therefore Q is true.

Compare this with 'plausible' reasoning:

P \Longrightarrow Q
Q is true, therefore P is more plausible.

This sort of inference is usually branded the logical fallacy of 'affirming the consequent' Q, since in logic this form of derivation does not necessarily guarantee that the conclusion is true. But is such a form of inference entirely without merit? By the logic of the syllogism, if Q were false, then P would necessarily be false. So when Q, a necessary condition for P being true, turns out to be true, should P not be 'more plausible'??

─── excerpted from 《物理哲學·下中……》

 

He brought 'probability theory' into the temple of logic,

Probability Theory: The Logic Of Science

The material available from this page is a PDF version of E. T. Jaynes's book.

Introduction

Please note that the contents of the file linked below are slightly out of sync with the actual contents of the book. The listing on this page corresponds to the existing chapter order and names.

……


── excerpted from 《W!O+ 的《小伶鼬工坊演義》︰神經網絡【學而堯曰】七》

 

hoping thereby to obtain the logically 'unique solution'!!

Although Jaynes's exposition is quite illuminating:

Jaynes’s solution using the “maximum ignorance” principle

In his 1973 paper “The Well-Posed Problem“,[2] Edwin Jaynes proposed a solution to Bertrand’s paradox, based on the principle of “maximum ignorance”—that we should not use any information that is not given in the statement of the problem. Jaynes pointed out that Bertrand’s problem does not specify the position or size of the circle, and argued that therefore any definite and objective solution must be “indifferent” to size and position. In other words: the solution must be both scale and translation invariant.

To illustrate: assume that chords are laid at random onto a circle with a diameter of 2, for example by throwing straws onto it from far away. Now another circle with a smaller diameter (e.g., 1.1) is laid into the larger circle. Then the distribution of the chords on that smaller circle needs to be the same as on the larger circle. If the smaller circle is moved around within the larger circle, the probability must not change either. It can be seen very easily that there would be a change for method 3: the chord distribution on the small red circle looks qualitatively different from the distribution on the large circle:

(figure: the chord distribution on a translated smaller circle under method 3)

The same occurs for method 1, though it is harder to see in a graphical representation. Method 2 is the only one that is both scale invariant and translation invariant; method 3 is just scale invariant, method 1 is neither.

However, Jaynes did not just use invariances to accept or reject given methods: this would leave the possibility that there is another not yet described method that would meet his common-sense criteria. Jaynes used the integral equations describing the invariances to directly determine the probability distribution. In this problem, the integral equations indeed have a unique solution, and it is precisely what was called “method 2” above, the random radius method.
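Jaynes's straw-throwing picture can be checked numerically: generate lines by the random-radius recipe on the unit circle, keep those that also cut a smaller, off-center circle, and see where they cross it. A sketch, assuming a recent numpy; the small circle's position, its radius, and the sample size are all arbitrary:

import numpy as np

rng = np.random.default_rng(3)
N = 2_000_000

# Method-2 lines on the unit circle: direction angle theta, distance d from center.
theta = rng.uniform(0.0, 2.0 * np.pi, N)
d = rng.uniform(0.0, 1.0, N)
ux, uy = np.cos(theta), np.sin(theta)

# A smaller circle of radius 0.3 centered at (0.25, 0.1), inside the big one.
cx, cy, rho = 0.25, 0.1, 0.3
dist = np.abs(d - (cx * ux + cy * uy))   # distance from the small center to each line
hits = dist < rho                        # lines that actually cut the small circle

# If method 2 is translation invariant, dist/rho is again uniform on [0, 1]:
# every bin should hold about one tenth of the hits.
u = dist[hits] / rho
print(np.histogram(u, bins=10, range=(0.0, 1.0))[0] / hits.sum())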

 

But in fact we cannot know a priori that nature admits more than one kind of statistics:

Maxwell–Boltzmann statistics

Fermi–Dirac statistics

Bose–Einstein statistics

 

Hence, when observing, inferring, and analyzing statistically, how could one fail to mind the procedure of 'hypothesis testing'?

Statistical hypothesis testing

A statistical hypothesis, sometimes called confirmatory data analysis, is a hypothesis that is testable on the basis of observing a process that is modeled via a set of random variables.[1] A statistical hypothesis test is a method of statistical inference. Commonly, two statistical data sets are compared, or a data set obtained by sampling is compared against a synthetic data set from an idealized model. A hypothesis is proposed for the statistical relationship between the two data sets, and this is compared as an alternative to an idealized null hypothesis that proposes no relationship between two data sets. The comparison is deemed statistically significant if the relationship between the data sets would be an unlikely realization of the null hypothesis according to a threshold probability—the significance level. Hypothesis tests are used in determining what outcomes of a study would lead to a rejection of the null hypothesis for a pre-specified level of significance. The process of distinguishing between the null hypothesis and the alternative hypothesis is aided by identifying two conceptual types of errors (type 1 & type 2), and by specifying parametric limits on e.g. how much type 1 error will be permitted.

An alternative framework for statistical hypothesis testing is to specify a set of statistical models, one for each candidate hypothesis, and then use model selection techniques to choose the most appropriate model.[2] The most common selection techniques are based on either Akaike information criterion or Bayes factor.

Confirmatory data analysis can be contrasted with exploratory data analysis, which may not have pre-specified hypotheses.
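As one concrete instance of the procedure, here is a sketch of a two-sample t-test with scipy on hypothetical samples; the 0.5 shift, the sample sizes, and the significance level are all illustrative:

import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
control = rng.normal(0.0, 1.0, 50)   # data from the idealized null model
treated = rng.normal(0.5, 1.0, 50)   # data with a genuine shift

t_stat, p_value = stats.ttest_ind(control, treated)
alpha = 0.05                         # pre-specified significance level
print(p_value, 'reject H0' if p_value < alpha else 'fail to reject H0')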

Cautions

“If the government required statistical procedures to carry warning labels like those on drugs, most inference methods would have long labels indeed.”[15] This caution applies to hypothesis tests and alternatives to them.

The successful hypothesis test is associated with a probability and a type-I error rate. The conclusion might be wrong.

The conclusion of the test is only as solid as the sample upon which it is based. The design of the experiment is critical. A number of unexpected effects have been observed including:

  • The clever Hans effect. A horse appeared to be capable of doing simple arithmetic.
  • The Hawthorne effect. Industrial workers were more productive in better illumination, and most productive in worse.
  • The placebo effect. Pills with no medically active ingredients were remarkably effective.

A statistical analysis of misleading data produces misleading conclusions. The issue of data quality can be more subtle. In forecasting for example, there is no agreement on a measure of forecast accuracy. In the absence of a consensus measurement, no decision based on measurements will be without controversy.

The book How to Lie with Statistics[16][17] is the most popular book on statistics ever published.[18] It does not much consider hypothesis testing, but its cautions are applicable, including: Many claims are made on the basis of samples too small to convince. If a report does not mention sample size, be doubtful.

Hypothesis testing acts as a filter of statistical conclusions; only those results meeting a probability threshold are publishable. Economics also acts as a publication filter; only those results favorable to the author and funding source may be submitted for publication. The impact of filtering on publication is termed publication bias. A related problem is that of multiple testing (sometimes linked to data mining), in which a variety of tests for a variety of possible effects are applied to a single data set and only those yielding a significant result are reported. These are often dealt with by using multiplicity correction procedures that control the family wise error rate (FWER) or the false discovery rate (FDR).

Those making critical decisions based on the results of a hypothesis test are prudent to look at the details rather than the conclusion alone. In the physical sciences most results are fully accepted only when independently confirmed. The general advice concerning statistics is, “Figures never lie, but liars figure” (anonymous).

……

Courtroom trial

A statistical test procedure is comparable to a criminal trial; a defendant is considered not guilty as long as his or her guilt is not proven. The prosecutor tries to prove the guilt of the defendant. Only when there is enough evidence for the prosecution is the defendant convicted.

In the start of the procedure, there are two hypotheses, H_0: "the defendant is not guilty", and H_1: "the defendant is guilty". The first one, H_0, is called the null hypothesis, and is for the time being accepted. The second one, H_1, is called the alternative hypothesis. It is the alternative hypothesis that one hopes to support.

The hypothesis of innocence is only rejected when an error is very unlikely, because one doesn’t want to convict an innocent defendant. Such an error is called error of the first kind (i.e., the conviction of an innocent person), and the occurrence of this error is controlled to be rare. As a consequence of this asymmetric behaviour, an error of the second kind (acquitting a person who committed the crime), is more common.

                              H0 is true              H1 is true
                              (truly not guilty)      (truly guilty)
  Accept null hypothesis      Right decision          Wrong decision
  (acquittal)                                         (Type II error)
  Reject null hypothesis      Wrong decision          Right decision
  (conviction)                (Type I error)

A criminal trial can be regarded as either or both of two decision processes: guilty vs not guilty or evidence vs a threshold (“beyond a reasonable doubt”). In one view, the defendant is judged; in the other view the performance of the prosecution (which bears the burden of proof) is judged. A hypothesis test can be regarded as either a judgment of a hypothesis or as a judgment of evidence.

………

Philosopher’s beans

The following example was produced by a philosopher describing scientific methods generations before hypothesis testing was formalized and popularized.[29]

Few beans of this handful are white.
Most beans in this bag are white.
Therefore: Probably, these beans were taken from another bag.
This is an hypothetical inference.

The beans in the bag are the population. The handful are the sample. The null hypothesis is that the sample originated from the population. The criterion for rejecting the null-hypothesis is the “obvious” difference in appearance (an informal difference in the mean). The interesting result is that consideration of a real population and a real sample produced an imaginary bag. The philosopher was considering logic rather than probability. To be a real statistical hypothesis test, this example requires the formalities of a probability calculation and a comparison of that probability to a standard.

A simple generalization of the example considers a mixed bag of beans and a handful that contain either very few or very many white beans. The generalization considers both extremes. It requires more calculations and more comparisons to arrive at a formal answer, but the core philosophy is unchanged; If the composition of the handful is greatly different from that of the bag, then the sample probably originated from another bag. The original example is termed a one-sided or a one-tailed test while the generalization is termed a two-sided or two-tailed test.

The statement also relies on the inference that the sampling was random. If someone had been picking through the bag to find white beans, then it would explain why the handful had so many white beans, and also explain why the number of white beans in the bag was depleted (although the bag is probably intended to be assumed much larger than one’s hand).

 

Perhaps, by way of this 'topic':

The lady tasting tea is a famous example of hypothesis testing[2]. A colleague of Fisher's claimed that she could tell whether the tea or the milk had been poured into a cup of milk tea first. Fisher proposed giving her eight cups, four with tea poured first and four with milk poured first, in random order; she was to say which of the eight had milk first and which had tea first. The test statistic was the number of correct identifications. The null hypothesis was that she had no ability to tell which went in first; the alternative hypothesis was that she did.

Considering chance alone (i.e. supposing she had no such ability), the probability of getting all eight cups right is 1/70, about 1.4%, so the 'rejection region' is the outcome in which all eight are correct. In the event she got all eight right[3], a statistically rather significant result.

Lady tasting tea

In the design of experiments in statistics, the lady tasting tea is a randomized experiment devised by Ronald Fisher and reported in his book The Design of Experiments (1935).[1] The experiment is the original exposition of Fisher’s notion of a null hypothesis, which is “never proved or established, but is possibly disproved, in the course of experimentation”.[2][3]

The lady in question (Muriel Bristol) claimed to be able to tell whether the tea or the milk was added first to a cup. Fisher proposed to give her eight cups, four of each variety, in random order. One could then ask what the probability was for her getting the specific number of cups she identified correct, but just by chance.

Fisher’s description is less than 10 pages in length and is notable for its simplicity and completeness regarding terminology, calculations and design of the experiment.[4] The example is loosely based on an event in Fisher’s life. The test used was Fisher’s exact test.
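The 1/70 quoted above falls straight out of counting the ways to choose the four 'milk-first' cups. A sketch of the arithmetic (factorials are used so it runs on older Pythons too):

from math import factorial

ways_all_right = 1                                            # only one correct choice of four cups
total_ways = factorial(8) // (factorial(4) * factorial(4))    # C(8, 4) = 70
print(ways_all_right / total_ways)   # 1/70, about 0.0143: the p-value if she gets all 8 right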

The experiment asked whether a taster could tell if the milk was added before the brewed tea, when preparing a cup of tea

 

to train one's 'statistical thinking'!!

Or perhaps one should also look to the 'extremes':

Some longtime science educators have found that the difficulty of the mathematical subjects can be ranked 'logic' < 'mathematics' < 'probability'. That is intriguing: does it amount to saying 'the necessary' < 'the abstract' < 'the uncertain'? Perhaps the unconscious 'reference points' that people cannot easily perceive are the source of 'polarized' views of things, just as some of the 'paradoxes' discussed in 《改不改??變不變!!》 read like brain teasers!

(image: Don Quixote tilting at windmills)

The opening line of Don Quixote, the classic by the Spanish writer Cervantes:

En un lugar de la Mancha, de cuyo nombre no quiero acordarme, no ha mucho tiempo que vivía un hidalgo de los de lanza en astillero, adarga antigua, rocín flaco y galgo corredor.

In a village of La Mancha, the name of which I have no desire to recall, there lived not long ago one of those gentlemen that keep a lance in the lance-rack, an old buckler, a lean hack, and a greyhound for coursing.

There is a passage in Don Quixote that goes:

Sancho Panza, governing his island, proclaimed a law requiring travelers crossing a bridge to state their business truthfully, on pain of hanging. One traveler, having read the notice at the bridge, declared that he was crossing in order to be hanged.

This put the enforcers of the law in a bind: if the traveler's statement were true, he ought to be released and spared the gallows, but then his statement would become false; if his statement were false, he ought to be hanged, but then his statement would become true. The traveler was brought before Sancho, who in the end set him free.

Starting in 1605 Cervantes wrote an 'anti-chivalric' novel; how was he to know that four hundred years later Broadway would dress his Don Quixote up as a 'dreamer'? Truly a 'knight of dreams' in the spirit of Ah Q, charging 'irrationally' ahead against every 'irrational' feature of contemporary society, never flinching. So many 'paradoxes' spring from the 'fallacy' of 'self-reference'; why should that be?

'A person's acts' are not outside his 'acts as a whole'; 'a person's words and thoughts' likewise belong to 'all his words and thoughts'. So in the long run of days, might it not come about that some 'ghost' holds that it is not 'one who has died', while some 'person' holds that he is 'a living ghost'? If the self of yesterday is, as it were, dead today, then the self of today will perish tomorrow. Arguing so, can people really understand what 'time' is? Perhaps 'time' is 'fair' only in this, that 'how to spend it' rests with each of us! Like a 'reformer' who pushes society forward, hoping to escape the lament: Sir, they debased all the values; we have beaten them, but we do not know how to rebuild the values they once tore down!!

─── excerpted from 《物理哲學·下中………》

 

Enough for a 'brain teaser'??

But better first to understand the whole picture ☆