Principal Component Analysis

Categories: Feature selection

Exposed methods:

Feature selection
Input:
Output:
Input format:	Weka's ARFF format
Output format:	Weka's ARFF format
User-specified parameters:	Variance covered; maximum number of attributes to include in transformation
Reporting information:	The optimal subset of variables
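
As an illustration of the "Variance covered" parameter, the sketch below shows one common reading of such a threshold: keep the smallest number of leading principal components whose eigenvalues together account for at least the requested fraction of the total variance. The class and method names (VarianceCovered, componentsNeeded) are hypothetical and not part of WEKA, and the component's actual behaviour may differ in detail.

```java
/** Hypothetical helper illustrating a "variance covered" threshold. */
class VarianceCovered {
    /**
     * @param eigenvalues     covariance eigenvalues, sorted in descending order
     * @param varianceCovered fraction of total variance to retain, in (0, 1]
     * @return smallest number of leading components reaching that fraction
     */
    static int componentsNeeded(double[] eigenvalues, double varianceCovered) {
        double total = 0.0;
        for (double ev : eigenvalues) {
            total += ev;
        }
        double cumulative = 0.0;
        for (int k = 0; k < eigenvalues.length; k++) {
            cumulative += eigenvalues[k];
            if (cumulative / total >= varianceCovered) {
                return k + 1; // components are counted from 1
            }
        }
        // Threshold never reached (e.g. rounding): keep every component.
        return eigenvalues.length;
    }
}
```

For example, with eigenvalues 4, 3, 2, 1 (total variance 10), a threshold of 0.5 is first met after two components, while a threshold of 0.95 requires all four.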

Description:

Principal Component Analysis (PCA) is mathematically defined as an orthogonal linear transformation that
maps the data to a new coordinate system such that the greatest variance of any projection of the data
lies along the first coordinate, the second greatest variance along the second coordinate, and so forth.
These coordinates are called principal components.
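
The definition above can be made concrete with a minimal two-dimensional sketch. It centers the data, builds the sample covariance matrix, and uses the closed-form eigen decomposition of a symmetric 2x2 matrix to obtain the variances along both principal axes and the direction of the first one. The class Pca2D is a hypothetical illustration, not WEKA code; a real implementation would handle arbitrary dimensions with a general eigen solver.

```java
/** Minimal 2-D PCA sketch (hypothetical helper, not part of WEKA). */
class Pca2D {
    /**
     * @param data rows of {x, y} points; at least two rows are assumed
     * @return {lambda1, lambda2, v1x, v1y}: covariance eigenvalues in
     *         descending order and the unit-length first principal axis
     */
    static double[] fit(double[][] data) {
        int n = data.length;
        double meanX = 0.0, meanY = 0.0;
        for (double[] p : data) {
            meanX += p[0];
            meanY += p[1];
        }
        meanX /= n;
        meanY /= n;

        // Sample covariance matrix [[sxx, sxy], [sxy, syy]] of the centered data.
        double sxx = 0.0, sxy = 0.0, syy = 0.0;
        for (double[] p : data) {
            double dx = p[0] - meanX;
            double dy = p[1] - meanY;
            sxx += dx * dx;
            sxy += dx * dy;
            syy += dy * dy;
        }
        sxx /= (n - 1);
        sxy /= (n - 1);
        syy /= (n - 1);

        // Closed-form eigenvalues of a symmetric 2x2 matrix.
        double half = (sxx + syy) / 2.0;
        double diff = (sxx - syy) / 2.0;
        double disc = Math.sqrt(diff * diff + sxy * sxy);
        double lambda1 = half + disc; // variance along the first component
        double lambda2 = half - disc; // variance along the second component

        // Eigenvector for lambda1; fall back to a coordinate axis if sxy == 0.
        double vx, vy;
        if (sxy != 0.0) {
            vx = sxy;
            vy = lambda1 - sxx;
        } else if (sxx >= syy) {
            vx = 1.0;
            vy = 0.0;
        } else {
            vx = 0.0;
            vy = 1.0;
        }
        double norm = Math.hypot(vx, vy);
        return new double[] { lambda1, lambda2, vx / norm, vy / norm };
    }
}
```

For points lying exactly on the line y = x, all variance falls on the first component, whose axis points along the diagonal, and the second eigenvalue is zero.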

Background (publication date, popularity/level of familiarity, rationale of approach, further comments)
PCA is closely related to factor analysis. Synonyms: Karhunen-Loève transform (KLT),
Hotelling transform, proper orthogonal decomposition (POD).

Class-blind/class-sensitive feature selection
Class-blind

Type (optimal, greedy, randomized)
Optimal (PCA is theoretically the optimal linear transform for a given data set in the least-squares sense)
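
The least-squares optimality can be stated as follows. The notation is an assumption introduced here for illustration: $x_1, \dots, x_n \in \mathbb{R}^d$ are the mean-centered data points, $S$ is their sample covariance matrix with eigenvalues $\lambda_1 \ge \dots \ge \lambda_d$, and $W$ ranges over $d \times k$ matrices with orthonormal columns.

```latex
S = \frac{1}{n-1} \sum_{i=1}^{n} x_i x_i^{\top},
\qquad
\min_{W^{\top} W = I_k} \; \sum_{i=1}^{n} \bigl\lVert x_i - W W^{\top} x_i \bigr\rVert^2
\;=\; (n-1) \sum_{j=k+1}^{d} \lambda_j
```

The minimum is attained when the columns of $W$ are the eigenvectors of $S$ belonging to the $k$ largest eigenvalues, that is, the first $k$ principal components. The reconstruction error therefore equals the variance carried by the discarded components.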

Filter/wrapper/hybrid approach
Filter

Type of Descriptor:

Interfaces:

Priority: Medium

Development status:

Homepage:

Dependencies:
External components: WEKA

Technical details

Data: No

Software: Yes

Programming language(s): Java

Operating system(s): Linux, Windows, Mac OS

Input format: Weka's ARFF format

Output format: Weka's ARFF format

License: GPL

References

References:
[PEA01] Pearson, K. (1901). On Lines and Planes of Closest Fit to Systems of Points in Space. Philosophical Magazine, 2 (6): 559-572.
