• When the classes are imbalanced, measure precision and recall for the minority (smaller) class.

  • Use R^2.

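    • R^2 (the coefficient of determination) measures how well a regression model fits the data (regression only).
    • A perfect fit scores 1. On data the model fits worse than simply predicting the mean, the score can drop below 0.
    • A large gap between the score on the training set and the score on the test set is a bad sign (usually overfitting).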
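
To make the R^2 points concrete, here is a minimal sketch that computes the score by hand as 1 - SS_res / SS_tot. The function name `r_squared` and the toy values are made up for illustration; scikit-learn provides the same metric as `r2_score`.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: error left over after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy targets (illustrative values only).
y_true = [1.0, 2.0, 3.0, 4.0]

perfect = r_squared(y_true, [1.0, 2.0, 3.0, 4.0])    # exact predictions
mean_only = r_squared(y_true, [2.5, 2.5, 2.5, 2.5])  # always predict the mean
worse = r_squared(y_true, [4.0, 3.0, 2.0, 1.0])      # reversed predictions

print(perfect, mean_only, worse)  # 1.0 0.0 -3.0
```

Computing the same score on the training set and on the test set and comparing the two is one way to spot the gap mentioned above.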

  • Cross-validation

    • Instead of relying on a single 20:80 test/train split, evaluate the model on several different splits.
    • There are different splitting strategies depending on the situation.
      • For example, when a specific group must be represented in every test set.
      • Or when you have a large amount of data and want to use only part of it.
        • Various methods are described in Chapter 5.1 of the book.
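
As a rough sketch of the basic idea, the following generates the train/test index lists for plain k-fold cross-validation without shuffling. The helper `k_fold_indices` is a hypothetical name written for this note; in practice scikit-learn's `KFold`, `StratifiedKFold`, `GroupKFold`, and `ShuffleSplit` (in `sklearn.model_selection`) cover this and the other strategies.

```python
def k_fold_indices(n_samples, n_folds):
    """Yield (train, test) index lists; every sample is tested exactly once."""
    # Spread the remainder so that fold sizes differ by at most one.
    fold_sizes = [
        n_samples // n_folds + (1 if i < n_samples % n_folds else 0)
        for i in range(n_folds)
    ]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


folds = list(k_fold_indices(n_samples=10, n_folds=5))
for train, test in folds:
    print("train:", train, "test:", test)
```

Each fold trains on 80% of the data and tests on the remaining 20%, and the model's score is averaged over the five folds. Because this sketch does not shuffle, data sorted by class would need a stratified or shuffled variant.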
  • Comparing accuracy numbers is not enough; choose the evaluation metric to fit how the model will actually be used.

    • For example, in medical diagnosis a false negative (a missed illness) is usually more dangerous than a false positive, so the two kinds of mistake should not be given the same weight.
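
The sketch below illustrates why accuracy alone can mislead on imbalanced data. Two made-up classifiers reach the same accuracy, but precision, recall on the minority class, and a cost-weighted error count tell them apart. The labels, the helper names, and the cost weights (a false negative counted as 10 times a false positive) are assumptions chosen for illustration, not values from the book. scikit-learn offers `precision_score`, `recall_score`, and `confusion_matrix` for the same purpose.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) with `positive` as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return tp, fp, fn, tn


def summarize(y_true, y_pred, fn_cost=10.0, fp_cost=1.0):
    """Accuracy, precision, recall, and a weighted error cost (weights are assumed)."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        # Define the ratio as 0.0 when the denominator is empty.
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "cost": fn_cost * fn + fp_cost * fp,
    }


# Imbalanced toy labels: 2 positives (minority class), 8 negatives.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
pred_a = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]  # always predicts the majority class
pred_b = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]  # finds both positives, 2 false alarms

report_a = summarize(y_true, pred_a)
report_b = summarize(y_true, pred_b)
print(report_a)
print(report_b)
```

Both classifiers score 0.8 accuracy, yet the first misses every positive case (recall 0.0, cost 20.0) while the second finds all of them at the price of two false alarms (precision 0.5, recall 1.0, cost 2.0).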

Supervised Learning

Machine Learning

Introduction to Machine Learning with Python