Used for machine learning.
Purpose: Splits a dataset into training and testing subsets.
Purpose: Evaluates a model's performance using cross-validation.
from sklearn.model_selection import train_test_split
from sklearn.model_selection import cross_val_score
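A minimal sketch of both utilities (the iris dataset and the 25% split are illustrative choices):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split, cross_val_score

X, y = load_iris(return_X_y=True)

# Hold out 25% of the rows for testing; stratify keeps class proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

# 5-fold cross-validation on the training portion: one accuracy per fold.
model = LogisticRegression(max_iter=1000)
scores = cross_val_score(model, X_train, y_train, cv=5)
```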
Purpose: Converts categorical data into a binary (one-hot encoded) format.
Purpose: Encodes target labels with values between 0 and n_classes-1.
Purpose: Converts categorical data into numeric values, preserving a meaningful order between categories.
from sklearn.preprocessing import OneHotEncoder
from sklearn.preprocessing import LabelEncoder
from sklearn.preprocessing import OrdinalEncoder
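A small sketch contrasting the three encoders (the toy color/size/label data is made up for illustration):

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder, LabelEncoder, OrdinalEncoder

colors = np.array([["red"], ["green"], ["blue"], ["green"]])
sizes = np.array([["small"], ["large"], ["medium"]])
labels = ["cat", "dog", "cat"]

# OneHotEncoder: one binary column per category, no implied order.
onehot = OneHotEncoder().fit_transform(colors).toarray()   # shape (4, 3)

# OrdinalEncoder: integers in an explicitly chosen, meaningful order.
orde = OrdinalEncoder(categories=[["small", "medium", "large"]])
ordinal = orde.fit_transform(sizes)                        # small=0 ... large=2

# LabelEncoder: for the *target* column, values 0..n_classes-1.
encoded = LabelEncoder().fit_transform(labels)
```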
Purpose: An under-sampling technique to balance the dataset by reducing the majority class.
Purpose: A hybrid sampling technique that combines SMOTE (over-sampling) and Tomek Links (under-sampling) to balance the dataset.
Purpose: A simple over-sampling technique to balance the dataset by duplicating samples from the minority class.
from imblearn.under_sampling import NearMiss
from imblearn.combine import SMOTETomek
from imblearn.over_sampling import RandomOverSampler
Purpose: A statistical test for assessing the relationship between categorical features and a target variable.
Purpose: Incrementally selects or removes features to find the best subset for a model.
Purpose: Evaluates all possible combinations of features to find the best subset.
Purpose: Measures the average squared difference between actual and predicted values in regression tasks.
Purpose: Selects features based on their importance weights from a fitted model.
from sklearn.feature_selection import chi2
from mlxtend.feature_selection import SequentialFeatureSelector
from mlxtend.feature_selection import ExhaustiveFeatureSelector
from sklearn.metrics import mean_squared_error
from sklearn.feature_selection import SelectFromModel
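A sketch of two of the approaches using only scikit-learn (the iris dataset, k=2, and the "median" threshold are illustrative; the mlxtend selectors follow the same fit/transform pattern):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, SelectFromModel, chi2

X, y = load_iris(return_X_y=True)   # 4 non-negative features

# chi2 scores each feature's dependence on the class labels;
# SelectKBest keeps the 2 highest-scoring features.
X_chi2 = SelectKBest(score_func=chi2, k=2).fit_transform(X, y)

# SelectFromModel keeps features whose fitted importance clears a threshold.
forest = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
X_model = SelectFromModel(forest, prefit=True, threshold="median").transform(X)
```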
Purpose: Scales and translates each feature to a given range (default is [0, 1]).
Purpose: Scales features to have zero mean and unit variance (standard normal distribution).
from sklearn.preprocessing import MinMaxScaler
from sklearn.preprocessing import StandardScaler
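A minimal sketch contrasting the two scalers on a single toy column:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

X = np.array([[1.0], [2.0], [3.0], [4.0]])

# MinMaxScaler: (x - min) / (max - min), so values land in [0, 1].
mm = MinMaxScaler().fit_transform(X)

# StandardScaler: (x - mean) / std, so the column has zero mean, unit variance.
ss = StandardScaler().fit_transform(X)
```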
Purpose:
Automates the process of hyperparameter tuning by exhaustively searching over a specified parameter grid.
Finds the best combination of hyperparameters for a machine learning model using cross-validation.
from sklearn.model_selection import GridSearchCV
Purpose:
Combines multiple weak learners (usually decision stumps, i.e., shallow decision trees) to create a strong classifier.
Focuses on correcting errors made by previous weak learners.
Purpose:
Builds an ensemble of decision trees, where each tree corrects the errors of the combined predictions of all previous trees.
Uses a gradient descent approach to minimize the loss function (e.g., log loss for classification).
from sklearn.ensemble import AdaBoostClassifier
from sklearn.ensemble import GradientBoostingClassifier
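A minimal sketch of both boosting classifiers (the breast-cancer dataset and 100 estimators are illustrative):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# AdaBoost: reweights misclassified samples so later stumps focus on them.
ada = AdaBoostClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

# Gradient boosting: each tree fits the gradient of the ensemble's loss so far.
gbc = GradientBoostingClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

ada_acc = ada.score(X_test, y_test)
gbc_acc = gbc.score(X_test, y_test)
```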
Purpose: A regression algorithm for predicting a continuous target variable.
Purpose: A classification algorithm for binary or multi-class problems.
Purpose: A classification algorithm that uses a tree-like structure to make decisions.
Purpose: A non-parametric classification algorithm based on proximity to training samples.
Purpose: A classification algorithm that separates data using hyperplanes.
Purpose: A probabilistic classification algorithm based on Bayes' theorem.
Purpose: An ensemble classification algorithm that builds multiple decision trees.
from sklearn.linear_model import LinearRegression
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import RandomForestClassifier
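A sketch fitting the listed classifiers side by side; all share the same fit/predict/score interface (the iris dataset and default parameters are illustrative):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "logreg": LogisticRegression(max_iter=1000),
    "tree": DecisionTreeClassifier(random_state=0),
    "knn": KNeighborsClassifier(),
    "svm": SVC(),
    "nb": GaussianNB(),
    "forest": RandomForestClassifier(random_state=0),
}
# Train each model and record its accuracy on the held-out test rows.
scores = {name: m.fit(X_train, y_train).score(X_test, y_test)
          for name, m in models.items()}
```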
Purpose: Combines the predictions of different estimators to make a final prediction (voting is either 1. hard or 2. soft).
from sklearn.ensemble import VotingClassifier
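A minimal sketch (the three base estimators are illustrative; voting="hard" takes the majority class, while voting="soft" averages predicted probabilities):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Soft voting: average the class probabilities of all three estimators.
clf = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("dt", DecisionTreeClassifier(random_state=0)),
        ("nb", GaussianNB()),
    ],
    voting="soft",
)
clf.fit(X, y)
```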
Purpose: Measures the proportion of correct predictions made by a model.
Purpose: Provides a matrix of actual vs. predicted classifications, giving a deeper view of a model's performance.
Purpose: Provides a detailed report with multiple evaluation metrics such as precision, recall, F1 score, and support for each class.
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
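A minimal sketch of all three metrics on made-up binary labels:

```python
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

y_true = [0, 1, 1, 0, 1, 1]
y_pred = [0, 1, 0, 0, 1, 1]

acc = accuracy_score(y_true, y_pred)             # fraction of correct predictions
cm = confusion_matrix(y_true, y_pred)            # rows: actual, cols: predicted
report = classification_report(y_true, y_pred)   # precision/recall/F1 per class
```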