
Principal Component Analysis

Contact: Stefan Kramer

Categories: Feature selection

Exposed methods:

Feature selection

Input:
Output:
Input format: Weka's ARFF format
Output format: Weka's ARFF format
User-specified parameters: Variance covered; Maximum number of attributes to include in transformation
Reporting information: The optimal subset of variables
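
One plausible reading of the "Variance covered" parameter is sketched below: keep the leading principal components until their eigenvalues account for the requested fraction of the total variance. This is an illustration under that assumption, not the component's actual code; the class name VarianceCovered and the method componentsToKeep are hypothetical.

```java
/** Hypothetical helper illustrating a "variance covered" cut-off. */
class VarianceCovered {

    /**
     * Returns how many leading components are needed so that their
     * eigenvalues sum to at least the given fraction of the total variance.
     * Assumes the eigenvalues are non-negative, not all zero, and sorted in
     * descending order, and that varianceCovered lies in (0, 1].
     */
    static int componentsToKeep(double[] eigenvalues, double varianceCovered) {
        double total = 0.0;
        for (double ev : eigenvalues) {
            total += ev;
        }
        double cumulative = 0.0;
        for (int i = 0; i < eigenvalues.length; i++) {
            cumulative += eigenvalues[i];
            if (cumulative / total >= varianceCovered) {
                return i + 1; // components 0..i are retained
            }
        }
        // Only reached through rounding error: keep everything.
        return eigenvalues.length;
    }
}
```

For eigenvalues 4, 3, 2, 1 the first two components carry 70% of the total variance, so a threshold of 0.5 keeps two components, while a threshold of 0.95 keeps all four.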

Description:

Principal Component Analysis (PCA) is mathematically defined as an orthogonal linear transformation that
maps the data to a new coordinate system such that the greatest variance of any projection of the data
lies along the first coordinate, the second greatest variance along the second coordinate, and so forth.
These coordinates are called the principal components.
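
The definition above can be illustrated with a minimal, self-contained sketch that finds the first principal component of a small data set. It builds the sample covariance matrix and approximates its dominant eigenvector by power iteration. This is not the WEKA implementation the component relies on; the class PcaSketch and its method names are hypothetical, and the sketch assumes that the covariance matrix is non-zero and that the start vector is not orthogonal to the dominant eigenvector.

```java
/** Minimal PCA sketch: first principal component via power iteration. */
class PcaSketch {

    /** Sample covariance matrix of the data (rows = instances, columns = attributes). */
    static double[][] covariance(double[][] data) {
        int n = data.length;
        int d = data[0].length;
        double[] mean = new double[d];
        for (double[] row : data)
            for (int j = 0; j < d; j++) mean[j] += row[j] / n;
        double[][] cov = new double[d][d];
        for (double[] row : data)
            for (int j = 0; j < d; j++)
                for (int k = 0; k < d; k++)
                    cov[j][k] += (row[j] - mean[j]) * (row[k] - mean[k]) / (n - 1);
        return cov;
    }

    /** Approximates the unit eigenvector of cov that has the largest eigenvalue. */
    static double[] firstComponent(double[][] cov, int iterations) {
        int d = cov.length;
        double[] v = new double[d];
        // Asymmetric start vector (1, 2, ..., d); it is normalised in the loop.
        for (int j = 0; j < d; j++) v[j] = j + 1;
        for (int it = 0; it < iterations; it++) {
            double[] w = new double[d];
            for (int j = 0; j < d; j++)
                for (int k = 0; k < d; k++) w[j] += cov[j][k] * v[k];
            double norm = 0.0;
            for (double x : w) norm += x * x;
            norm = Math.sqrt(norm); // zero only if the assumptions above are violated
            for (int j = 0; j < d; j++) v[j] = w[j] / norm;
        }
        return v;
    }

    /** Variance of the data along the unit direction v, i.e. v' * cov * v. */
    static double varianceAlong(double[][] cov, double[] v) {
        double result = 0.0;
        for (int j = 0; j < v.length; j++)
            for (int k = 0; k < v.length; k++) result += v[j] * cov[j][k] * v[k];
        return result;
    }

    public static void main(String[] args) {
        // Made-up example data: two strongly correlated attributes.
        double[][] data = { {2.0, 1.9}, {0.5, 0.7}, {3.1, 3.0}, {1.2, 1.0}, {4.0, 4.2} };
        double[][] cov = covariance(data);
        double[] pc1 = firstComponent(cov, 100);
        System.out.println("First principal component: " + java.util.Arrays.toString(pc1));
        System.out.println("Variance along it: " + varianceAlong(cov, pc1));
    }
}
```

Projecting the centred data onto the returned direction gives the first new coordinate. Further components could be obtained by repeating the procedure on the covariance matrix with the contribution of the components already found removed (deflation), or by a full eigendecomposition.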

Background (publication date, popularity/level of familiarity, rationale of approach, further comments)
PCA is closely related to factor analysis. Synonyms: Karhunen-Loève transform (KLT),
Hotelling transform, and proper orthogonal decomposition (POD).

Class-blind/class-sensitive feature selection
Class-blind

Type (optimal, greedy, randomized)
Optimal (PCA is theoretically the optimal linear transform for a given data set in the least-squares sense)
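
The least-squares optimality can be stated compactly. The symbols below are introduced here for illustration only: x_1, ..., x_n are the mean-centred data vectors in R^d, k is the number of components kept, and W is a d-by-k matrix with orthonormal columns.

```latex
\min_{W \in \mathbb{R}^{d \times k},\; W^{\top} W = I_k}
  \; \sum_{i=1}^{n} \left\lVert x_i - W W^{\top} x_i \right\rVert_2^{2}
```

A minimiser is obtained by taking the columns of W to be the k eigenvectors of the sample covariance matrix with the largest eigenvalues, which are the first k principal components.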

Filter/wrapper/hybrid approach
Filter

Type of Descriptor:

Interfaces:

Priority: Medium

Development status:

Homepage:

Dependencies:
External components: WEKA


Technical details

Data: No

Software: Yes

Programming language(s): Java

Operating system(s): Linux, Win, Mac OS

Input format: Weka's ARFF format

Output format: Weka's ARFF format

License: GPL


References

References:
[PEA01] Pearson, K. (1901), On Lines and Planes of Closest Fit to Systems of Points in Space, Philosophical Magazine, 2 (11): 559-572.
