AIDive

What is Data Mining?


Find patterns, groups, connections and useful signals in large data sets.

Definition

Data mining is the search for patterns, groups, relationships, and useful signals in large data sets. Put simply, it turns raw data into a basis for analytics, recommendations, and models. When evaluating a data mining tool, it helps to understand what the tool can actually do, what data it needs, and which limitations to check before implementation.

Example

A retailer analyzes purchase records and finds items that are often bought together.
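
The retail example can be sketched in a few lines of Python. This is a minimal illustration, not a production method: the transactions are invented, and `pair_support` is a hypothetical helper name. It counts how often each pair of items shares a basket and reports the share of baskets containing that pair, a measure usually called support.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions, invented for illustration only.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
    {"milk", "bread"},
]


def pair_support(baskets):
    """Return the fraction of baskets that contain each pair of items."""
    counts = Counter()
    for basket in baskets:
        # Sort so that each pair is always counted under the same key.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    total = len(baskets)
    return {pair: n / total for pair, n in counts.items()}


support = pair_support(transactions)
top_pair, top_support = max(support.items(), key=lambda kv: kv[1])
print(top_pair, top_support)  # ('bread', 'butter') 0.6
```

Counting every pair is fine for a small catalog. With many distinct items the number of pairs grows quickly, which is why real systems typically prune rare items first.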

Why it matters

Data mining is useful for generating hypotheses, segments, and features that can then be used in AI systems. Understanding it also helps you choose AI tools by how they perform on a real problem rather than by big promises.

How it works

Data is collected, cleaned, described, transformed, and analyzed to reach a well-supported conclusion or to prepare a model. For data mining in particular, it is worth examining three things separately: the data itself, the quality criteria, and the conditions under which the results apply.
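
The stages named above can be sketched as a small pipeline. This is an illustration under stated assumptions: the order records, the function names, and the spend threshold of 80 are all hypothetical, and a real project would likely use a data-frame library rather than plain dictionaries.

```python
from statistics import mean

# Collect: hypothetical raw order records; all values are invented.
raw_orders = [
    {"customer": "a", "amount": "20.0"},
    {"customer": "b", "amount": ""},       # missing value
    {"customer": "a", "amount": "40.0"},
    {"customer": "c", "amount": "-5"},     # invalid negative amount
    {"customer": "b", "amount": "100.0"},
]


def clean(records):
    """Clean: drop rows whose amount is missing or not a positive number."""
    cleaned_rows = []
    for row in records:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue
        if amount > 0:
            cleaned_rows.append({"customer": row["customer"], "amount": amount})
    return cleaned_rows


def describe(records):
    """Describe: basic summary statistics of the cleaned amounts."""
    amounts = [r["amount"] for r in records]
    return {"rows": len(amounts), "mean": mean(amounts), "max": max(amounts)}


def transform(records):
    """Transform: aggregate total spend per customer."""
    totals = {}
    for r in records:
        totals[r["customer"]] = totals.get(r["customer"], 0.0) + r["amount"]
    return totals


def analyze(totals, threshold):
    """Analyze: label each customer as a 'high' or 'low' spender."""
    return {c: "high" if t >= threshold else "low" for c, t in totals.items()}


cleaned = clean(raw_orders)
summary = describe(cleaned)
totals = transform(cleaned)
segments = analyze(totals, threshold=80.0)  # threshold is an assumed value
print(summary["rows"], totals, segments)
```

Keeping each stage as a separate function makes it easier to inspect intermediate results, which is where most data quality problems show up.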

Where it is used

  • Data mining is used in analytics, data preparation, pattern discovery, reporting, forecasting, and model building.

Limitations

Even careful analysis can be flawed if the data is biased, outdated, poorly cleaned, or misinterpreted.