Data Mining

Data mining, also known as knowledge discovery in data (KDD), is most commonly defined as the process of searching large data sets for patterns and trends and turning those findings into business insights and predictions. It goes beyond search alone: it uses the data to estimate future probabilities and to produce actionable analyses.

Phases of Data Mining

  • Business Understanding
  • Data Analysis
  • Data Acquisition
  • Data Cleansing
  • Data Preparation
  • Data Modelling
  • Data Transformation
  • Data Classification
  • Data Forecasting
  • Data Reporting


Data Mining Process Models

The Cross-Industry Standard Process for Data Mining (CRISP-DM) is a widely used process model that offers a well-organized, step-by-step method for carrying out a data mining project.

SEMMA (Sample, Explore, Modify, Model, Assess) is a process model developed by SAS Institute. It lets users apply visual and exploratory techniques to select and transform the predictive variables, and then construct models from those variables.
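The five SEMMA steps can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function name `semma`, the toy records, and the choice of a one-variable least-squares model are all hypothetical and are not taken from SAS tooling.

```python
import random
from statistics import mean


def semma(records, sample_size, seed=0):
    """Toy walk through Sample, Explore, Modify, Model, Assess (illustrative only)."""
    # Sample: draw a reproducible random subset of (x, y) records.
    rng = random.Random(seed)
    sample = rng.sample(records, sample_size)

    # Explore: compute simple summary statistics of the sample.
    mean_x = mean(x for x, _ in sample)
    mean_y = mean(y for _, y in sample)

    # Modify: center the predictor so the intercept equals the mean of y.
    centered = [(x - mean_x, y) for x, y in sample]

    # Model: fit y = intercept + slope * (x - mean_x) by least squares.
    sxx = sum(cx * cx for cx, _ in centered)
    sxy = sum(cx * (y - mean_y) for cx, y in centered)
    slope = sxy / sxx
    intercept = mean_y

    # Assess: mean absolute error on the records left out of the sample.
    holdout = [r for r in records if r not in sample]
    errors = [abs(intercept + slope * (x - mean_x) - y) for x, y in holdout]
    return {"slope": slope, "intercept": intercept, "mae": mean(errors)}


# Hypothetical data: y depends exactly linearly on x.
records = [(float(x), 2.0 * x + 1.0) for x in range(20)]
result = semma(records, sample_size=12)
print(result)
```

Because the toy data is exactly linear, the fitted slope comes out at about 2 and the held-out error is close to zero. With real data, the Assess step is where a weak model would show up.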

Data Mining Challenges

Mining various types of knowledge in databases - Different users have different requirements and are interested in different types of knowledge. As a result, data mining must cover a wide range of knowledge discovery tasks.

Interactive knowledge mining at multiple levels of abstraction - The data mining process must be interactive, so that users can focus the search for patterns and then refine their mining requests based on the results returned.

Background knowledge - Background knowledge can be used to guide the discovery process and to express discovered patterns, not only in concise terms but also at multiple levels of abstraction.

Ad-hoc data mining and data mining query languages - A data mining query language that allows users to describe ad-hoc mining tasks should be integrated with a data warehouse query language and optimized for efficient and flexible data mining.

Data mining results presentation and visualization - Once patterns are identified, they must be expressed in high-level languages and visual representations. Users should be able to easily understand these representations.

Handling noisy or incomplete data - Mining regularities from data requires data cleaning methods that can handle noise and incomplete records. Without them, the accuracy of the discovered patterns will be low.
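As a minimal sketch of what such cleaning can look like, the snippet below fills missing values with the median and clips outliers to the interquartile-range fences. The function name `clean_column`, the 1.5 x IQR rule, and the sample readings are assumptions chosen for illustration; a real project would pick the method that suits its data.

```python
from statistics import median, quantiles


def clean_column(values):
    """Fill missing entries (None) with the median, then clip outliers to the IQR fences."""
    present = [v for v in values if v is not None]
    fill = median(present)

    # Quartiles of the observed values; the fences use the common 1.5 * IQR rule.
    q1, _, q3 = quantiles(present, n=4)
    spread = q3 - q1
    low, high = q1 - 1.5 * spread, q3 + 1.5 * spread

    cleaned = []
    for v in values:
        v = fill if v is None else v            # handle incomplete data
        cleaned.append(min(max(v, low), high))  # handle noisy data
    return cleaned


# Hypothetical sensor readings with one missing value and one obvious outlier.
readings = [10.0, 12.0, None, 11.0, 13.0, 500.0, 12.0, 11.0, 12.0, 13.0, 10.0, 11.0]
cleaned = clean_column(readings)
print(cleaned)
```

The missing reading is replaced by the median of the observed values, and the reading of 500.0 is pulled back to the upper fence instead of distorting later statistics.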

Pattern evaluation - This refers to how interesting the discovered patterns are. Many discovered patterns are uninteresting because they merely represent common knowledge or lack novelty, so measures are needed to judge which patterns are worth keeping.
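One common way to score how interesting an association pattern is uses support, confidence, and lift. The sketch below computes all three for a single rule over a handful of made-up shopping baskets. The function name `rule_metrics` and the data are hypothetical, and these three measures are just one possible choice.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return support, confidence, and lift for the rule antecedent -> consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    n = len(transactions)

    # Count transactions containing each side of the rule, and both sides together.
    count_a = sum(1 for t in transactions if antecedent <= t)
    count_c = sum(1 for t in transactions if consequent <= t)
    count_both = sum(1 for t in transactions if (antecedent | consequent) <= t)

    support = count_both / n
    confidence = count_both / count_a if count_a else 0.0
    # Lift compares the rule's confidence with the consequent's baseline frequency.
    lift = confidence / (count_c / n) if count_c else 0.0
    return {"support": support, "confidence": confidence, "lift": lift}


# Hypothetical market baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "eggs"},
]
metrics = rule_metrics(baskets, {"bread"}, {"milk"})
print(metrics)
```

Here the rule "bread -> milk" has a lift below 1, which suggests that it tells us nothing beyond how common milk already is. That is the kind of pattern an evaluation step would discard as uninteresting.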

Data Mining Tools

  • IBM SPSS
  • Amazon EMR
  • SAS
  • Oracle Data Mining
  • KNIME
  • RapidMiner
  • Orange
  • QlikView
  • SSDT
