Commit

yo
fderyckel committed Apr 21, 2024
1 parent 2158c49 commit 8e3cab1
Showing 20 changed files with 14 additions and 0 deletions.
@@ -0,0 +1,14 @@
{
"hash": "924d0a3cf1e163dd82c56852b67cd6a2",
"result": {
"markdown": "---\ntitle: \"Defining Success\"\nauthor: \"Francois de Ryckel\"\ndate: \"2024-04-16\"\ncategories: [sklearn, tidymodel]\neditor: source\ndate-modified: '2024-04-20'\nexecute:\n cache: true\n---\n\n\nWhen evaluating models for a given ML algorithm, we need to define in advance what would be our metric to measure success. How would we decide if this models is better than this model? Or even which are the hyper-parameters that fine-tuned a model better? \n\nThis post is about defining what is *'best'* or *'better'* when comparing different **supervised models**. we'll have 2 main parts: measure of success for regression models and measure of success for classification models. \n\n# Regression models \n\nWhen modeling for regression, we somehow **measure the distance between our prediction and the actual observed value**. When comparing models, we usually want to keep the model which give the smallest sum of distance. \n\n## RMSE\n\nThis is probably the most well-known measure when comparing regression models. Because we are squaring the distance between the predicted and the observed, this penalizes predicted values that are far off the real values. Hence this measures is used when we want to avoid 'outlier' predictions (prediction that are far off.)\n\n$$RMSE = \\sqrt \\frac{\\sum_{i=1}^{n}(y_i - \\hat{y}_i)^2}{n}$$\n\n## MAE\n\nWith **Mean Absolute Error**, we just take the average of the errors. Useful when we don't really care if predictions is far off from the observed data. \n\n$$MAE = \\frac {\\sum_{i=1}^{n} \\lvert y_i - \\hat{y}_i \\rvert}{n}$$\n\n## Huber Loss\n\nHuber loss is a mixture of RMSE and MAE. Kind of the best of both world basically. \n\n$$$$\n\n\n# Classfication models \n\n## Accuracy \n\nShortcomings: \n\n* for imbalanced dataset, we can have good accuracy by just predicting most observation with the most frequent class. For instance in the case of a rare disease or big financial meltdown, we can just predict \n\n## Precision \n\nIf you call it true, is it indeed true? In other words, the proportion of predicted positive that are actually positive. \n\n## Recall \n\nIf there is a positive, did the model predict a positive. \n\n\n## F1 score \n\nIt is the **harmonic mean** of both precision and recall. The harmonic mean penalizes model that have very low precision or recall. Which wouldn't be the case with arithmetic mean. \n\n$$\\frac{2 \\cdot Precision \\cdot Recall}{Precision + Recall}$$\n\n## AUC & ROC Curve\n\nneed to get the prediction as a probability \n\n::: {.cell hash='index_cache/html/unnamed-chunk-1_f8febe300bdf5210f7934e8f7c002c44'}\n\n```{.r .cell-code}\nlibrary(yardstick)\n```\n:::\n",
"supporting": [],
"filters": [
"rmarkdown/pagebreak.lua"
],
"includes": {},
"engineDependencies": {},
"preserve": {},
"postProcess": true
}
}
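
The post cached in the JSON above defines RMSE, MAE, and the Huber loss only as formulas. As a minimal sketch of how they could be computed, here is a base R version on made-up vectors `y` and `y_hat`, with an assumed Huber threshold `delta = 1` (the post itself does not fix a value):

```r
# A minimal sketch, not from the commit: regression metrics on toy data.
# y / y_hat are made up; delta = 1 is an assumed Huber threshold.
y     <- c(3.1, 2.4, 5.8, 4.2, 6.0)
y_hat <- c(2.9, 2.7, 5.1, 4.8, 9.5)  # the last prediction is far off

rmse <- sqrt(mean((y - y_hat)^2))    # squares the errors: punishes the outlier
mae  <- mean(abs(y - y_hat))         # plain average of absolute errors

huber <- function(y, y_hat, delta = 1) {
  err <- abs(y - y_hat)
  mean(ifelse(err <= delta,
              0.5 * err^2,                  # quadratic near zero, like RMSE
              delta * (err - 0.5 * delta))) # linear in the tails, like MAE
}

c(rmse = rmse, mae = mae, huber = huber(y, y_hat))
```

On this toy data the single far-off prediction inflates the RMSE well above the MAE, which is exactly the trade-off the post describes.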
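Likewise, the classification metrics map onto yardstick, the package the post loads in its last cell. A minimal sketch on a made-up data frame; the columns `truth`, `pred`, and `prob` are toy values, with "pos" listed first so yardstick treats it as the event level:

```r
# A minimal sketch, not from the commit: classification metrics with yardstick.
# truth / pred / prob are made-up toy columns.
library(yardstick)

lv <- c("pos", "neg")  # event level first (yardstick's default convention)
df <- data.frame(
  truth = factor(c("pos", "pos", "neg", "neg", "neg", "pos"), levels = lv),
  pred  = factor(c("pos", "neg", "neg", "pos", "neg", "pos"), levels = lv),
  prob  = c(0.90, 0.40, 0.20, 0.60, 0.10, 0.80)  # predicted P(truth == "pos")
)

accuracy(df, truth, pred)
precision(df, truth, pred)  # TP / (TP + FP): if we call it positive, is it?
recall(df, truth, pred)     # TP / (TP + FN): did we catch the real positives?
f_meas(df, truth, pred)     # harmonic mean of precision and recall (F1 at beta = 1)
roc_auc(df, truth, prob)    # needs the probability, not the hard class label
```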
