In machine learning, pattern recognition, and image processing, feature extraction starts from an initial set of measured data and builds derived values (features) intended to be informative and non-redundant, facilitating the subsequent learning and generalization steps and, in some cases, leading to better human interpretability. Feature extraction is related to dimensionality reduction.
When the input data to an algorithm is too large to be processed and is suspected to be redundant (e.g. the same measurement in both feet and meters, or the repetitiveness of images presented as pixels), then it can be transformed into a reduced set of features (also named a feature vector). This process is called feature extraction. The selected features are expected to contain the relevant information from the input data, so that the desired task can be performed using this reduced representation instead of the complete initial data.
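As an illustrative sketch (the data and variable names here are invented for the example), the feet-and-meters kind of redundancy can be collapsed with a centered SVD, which is the linear-algebra core of PCA:

```python
import numpy as np

# Invented example data: the same length measured in meters and (redundantly)
# in feet, plus one independent column.
rng = np.random.default_rng(0)
meters = rng.uniform(1.0, 2.0, size=100)
feet = meters * 3.28084                        # exactly redundant with `meters`
weight = rng.uniform(50.0, 100.0, size=100)
X = np.column_stack([meters, feet, weight])    # raw input: 3 columns

# Center the data and take an SVD.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# The third singular value is ~0: meters and feet carry only one dimension
# of information, so a 2-column feature vector replaces the 3-column input.
X_reduced = Xc @ Vt[:2].T
```

The near-zero trailing singular value is exactly the redundancy the paragraph describes: two columns that are linear rescalings of one another contribute only one informative feature.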
A cepstrum (/ˈkɛpstrəm, ˈsɛpstrəm/) is the result of taking the inverse Fourier transform (IFT) of the logarithm of the estimated spectrum of a signal. It may be pronounced in the two ways given, the second having the advantage of avoiding confusion with ‘kepstrum’, which also exists (see below). There is a complex cepstrum, a real cepstrum, a power cepstrum, and a phase cepstrum. The power cepstrum in particular finds applications in the analysis of human speech.
The name “cepstrum” was derived by reversing the first four letters of “spectrum”. Operations on cepstra are labelled quefrency analysis (aka quefrency alanysis), liftering, or cepstral analysis.
[Figure: steps in forming the cepstrum from a time history]
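Those steps (forward transform, log of the power spectrum, inverse transform) can be sketched in NumPy. This follows the power-cepstrum convention; the epsilon and the pure-tone test signal are illustrative choices, not part of any standard:

```python
import numpy as np

def power_cepstrum(x):
    """IFFT of the log power spectrum; a small epsilon guards against log(0)."""
    spectrum = np.fft.fft(x)
    log_power = np.log(np.abs(spectrum) ** 2 + 1e-12)
    return np.abs(np.fft.ifft(log_power)) ** 2

# Illustrative test signal: one second of a 50 Hz tone sampled at 1 kHz.
fs = 1000
t = np.arange(fs) / fs
c = power_cepstrum(np.sin(2 * np.pi * 50.0 * t))
```

Because the log power spectrum of a real signal is real and even, the resulting cepstrum is symmetric about its midpoint; the independent variable of the cepstrum is called "quefrency" and has units of time.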
Feature engineering is the process of using domain knowledge of the data to create features that make machine learning algorithms work. Feature engineering is fundamental to the application of machine learning, and is both difficult and expensive. The need for manual feature engineering can be obviated by automated feature learning.
Feature engineering is an informal topic, but it is considered essential in applied machine learning.
Coming up with features is difficult, time-consuming, requires expert knowledge. “Applied machine learning” is basically feature engineering. — Andrew Ng
A feature is an attribute or property shared by all of the independent units on which analysis or prediction is to be done. Any attribute could be a feature, as long as it is useful to the model.
The purpose of a feature is easier to understand in the context of a specific problem: a feature is a characteristic that might help when solving that problem.
Importance of features
The features in your data directly influence the predictive models you use and the results you achieve: the quality and quantity of the features largely determine how good the resulting model can be.
You could say the better the features are, the better the result is. This isn’t entirely true, because the results achieved also depend on the model and the data, not just the chosen features. That said, choosing the right features is still very important. Better features can produce simpler and more flexible models, and they often yield better results.
The algorithms we used are very standard for Kagglers. […] We spent most of our efforts in feature engineering. […] We were also very careful to discard features likely to expose us to the risk of over-fitting our model. — Xavier Conort, “Q&A with Xavier Conort”
…some machine learning projects succeed and some fail. What makes the difference? Easily the most important factor is the features used. — Pedro Domingos, “A Few Useful Things to Know about Machine Learning”
These descriptions correspond to the functions in librosa's feature module (the function names in the first column are restored from that module's documentation):

|Function|Description|
|---|---|
|`chroma_stft`|Compute a chromagram from a waveform or power spectrogram.|
|`chroma_cens`|Compute the chroma variant “Chroma Energy Normalized” (CENS), following [R15].|
|`melspectrogram`|Compute a mel-scaled spectrogram.|
|`mfcc`|Mel-frequency cepstral coefficients.|
|`rms`|Compute root-mean-square (RMS) energy for each frame, either from the audio samples y or from a spectrogram S.|
|`spectral_centroid`|Compute the spectral centroid.|
|`spectral_bandwidth`|Compute p’th-order spectral bandwidth.|
|`spectral_contrast`|Compute spectral contrast [R16].|
|`spectral_rolloff`|Compute the roll-off frequency.|
|`poly_features`|Get coefficients of fitting an nth-order polynomial to the columns of a spectrogram.|
|`tonnetz`|Compute the tonal centroid features (tonnetz), following the method of [R17].|
|`zero_crossing_rate`|Compute the zero-crossing rate of an audio time series.|
|`tempogram`|Compute the tempogram: local autocorrelation of the onset strength envelope.|
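As a simplified sketch, two of the easier features in the table (frame-wise RMS energy and zero-crossing rate) can be computed directly in NumPy. The `frame_signal` helper and the frame/hop defaults are illustrative simplifications, not librosa's implementation (which additionally centers and pads frames):

```python
import numpy as np

def frame_signal(y, frame_length=2048, hop_length=512):
    """Slice a 1-D signal into overlapping frames, shape (n_frames, frame_length)."""
    n_frames = 1 + (len(y) - frame_length) // hop_length
    idx = np.arange(frame_length)[None, :] + hop_length * np.arange(n_frames)[:, None]
    return y[idx]

def rms(y, frame_length=2048, hop_length=512):
    """Root-mean-square energy of each frame."""
    frames = frame_signal(y, frame_length, hop_length)
    return np.sqrt(np.mean(frames ** 2, axis=1))

def zero_crossing_rate(y, frame_length=2048, hop_length=512):
    """Fraction of consecutive-sample pairs in each frame that change sign."""
    frames = frame_signal(y, frame_length, hop_length)
    crossings = np.abs(np.diff(np.signbit(frames).astype(int), axis=1))
    return crossings.mean(axis=1)

# Illustrative signal: one second of a 440 Hz tone at a 22050 Hz sample rate.
sr = 22050
t = np.arange(sr) / sr
y = 0.5 * np.sin(2 * np.pi * 440.0 * t)
r = rms(y)
z = zero_crossing_rate(y)
```

For a pure sine of amplitude 0.5, each frame's RMS is close to 0.5/√2, and a 440 Hz tone crosses zero about 880 times per second, so the per-sample zero-crossing rate is roughly 880/22050.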