(Information Science Guru) Machine Learning

  • Represent the inclusion relationships between patterns as a graph.
    • Pattern: a set of elements.
      • Example: In purchase data, a set of items bought by a customer.
  • Why use this representation?
    • When you want to know the frequency or probability of each pattern (e.g., the frequency/probability of each item in purchase data).
    • An exhaustive search takes time, because the number of candidate patterns grows exponentially with the number of items.
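    • With the Apriori algorithm:
      • Because the inclusion relationships are known as a graph, there is no need to calculate the frequency/probability for all patterns: if a pattern is infrequent, every pattern that contains it is also infrequent and can be skipped.
      • The frequency of a pattern can be obtained by summing over all the nodes below it in the graph.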
      • The probability distribution can also be obtained using the Boltzmann machine method.
        • This yields a probability distribution, not an empirical distribution.
          • Empirical distribution: the distribution of the available data, taken as is.
          • Probability distribution: a distribution that has been adjusted to be closer to the true distribution based on the relationships between nodes. #Pattern Mining
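
The pruning idea in the Apriori bullet can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function name `apriori`, the `min_support` threshold, and the toy `purchases` data are all hypothetical and do not come from the notes. It only computes counts (the empirical side); the Boltzmann machine step is not covered.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {pattern: count} for every pattern found in at least
    `min_support` transactions. Patterns are frozensets of items."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted({item for t in transactions for item in t})

    # Level 1 candidates: every single item.
    candidates = [frozenset([item]) for item in items]
    frequent = {}
    k = 1
    while candidates:
        # Count the transactions that contain each candidate pattern.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)

        # Join frequent size-k patterns into size-(k+1) candidates.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}

        # Prune using the inclusion relationships: a candidate survives
        # only if every one of its size-(k-1) subsets is itself frequent.
        candidates = [
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return frequent


# Hypothetical toy purchase data: each set is one customer's basket.
purchases = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

frequent = apriori(purchases, min_support=2)

# Empirical probability of each frequent pattern: count / number of baskets.
empirical = {pattern: n / len(purchases) for pattern, n in frequent.items()}
```

In this toy data, `{milk, butter}` appears in only one basket, so `{bread, milk, butter}` is discarded without ever being counted. That is the saving the graph of inclusion relationships gives over an exhaustive search.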