arafathossain
Joined: 30 Dec 2024 Posts: 1
Posted: Mon Dec 30, 2024 06:06 Post subject: Using the Random Forest Algorithm to Study Ranking Factors
Jokes aside, this post is for real nerds, so here's a quick glossary:
Decision tree - a tree-like model used in machine learning, typically for classification tasks. It splits a sample dataset into increasingly homogeneous groups/subsets, based on the most significant attributes.
Supervised machine learning - a type of machine learning that fits a model to the relationship, linear or not, between input variables (features, A) and an output variable (target value, B): B = f(A). The goal is to train this model on a sample of the data so that, when out-of-sample data is analyzed, the algorithm can accurately predict the target value for the given inputs. The training dataset acts as the teacher that supervises the learning process. Training ends when the algorithm reaches an acceptable level of performance.
Features (or variables, or input variables) - the independent variables used in the analysis. For our study and this post, features are the supposed ranking factors.
Binary classification - a type of classification task that falls under supervised learning. The goal is to predict a target value (= class) for each input, and in binary classification that value can only be 1 or 0.
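To make the glossary a bit more concrete, here is a minimal sketch in plain Python of the simplest possible decision tree (a single split, often called a "stump") trained as a supervised binary classifier. Everything in it is an assumption for illustration only: the two features (`backlinks`, `load_seconds`), the toy numbers, and the helper names `gini`, `fit_stump` and `predict` are made up and are not the data or the method of the study. Gini impurity is used here as one common way to measure how homogeneous a subset is.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 means perfectly homogeneous)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def majority(labels):
    """Most frequent class in a non-empty list of labels."""
    return Counter(labels).most_common(1)[0][0]


def fit_stump(A, B):
    """Learn B = f(A) as one split: pick the (feature, threshold) pair
    that gives the lowest weighted Gini impurity over the two subsets."""
    best = None  # (impurity, feature_index, threshold, left_class, right_class)
    for j in range(len(A[0])):
        for t in sorted({row[j] for row in A}):
            left = [b for row, b in zip(A, B) if row[j] <= t]
            right = [b for row, b in zip(A, B) if row[j] > t]
            if not left or not right:
                continue  # a split must produce two non-empty subsets
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(A)
            if best is None or score < best[0]:
                best = (score, j, t, majority(left), majority(right))
    if best is None:
        raise ValueError("no valid split found")
    return best[1:]


def predict(stump, row):
    """Binary prediction (1 or 0) for one input row."""
    j, t, left_class, right_class = stump
    return left_class if row[j] <= t else right_class


# Hypothetical training sample: features A = [backlinks, load_seconds],
# target B = 1 if the page is in the top 10, else 0. Invented numbers.
A_train = [[120, 1.2], [95, 2.8], [80, 1.9], [15, 1.1], [10, 3.0], [5, 2.2]]
B_train = [1, 1, 1, 0, 0, 0]

stump = fit_stump(A_train, B_train)

# Out-of-sample rows the model has not seen during training.
print(stump)
print(predict(stump, [60, 2.0]))
print(predict(stump, [8, 1.5]))
```

A random forest, roughly speaking, trains many such trees (usually much deeper ones) on random subsets of the rows and features and lets them vote, which is why the individual tree is the term worth understanding first.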