This presentation introduces data preprocessing in the field of data mining. Images, examples, and other material are adapted from "Data Mining: Concepts and Techniques" by Jiawei Han, Micheline Kamber, and Jian Pei.
The document discusses deductive databases and how they differ from conventional databases. Deductive databases contain facts and rules that allow implicit facts to be deduced from the stored information. This reduces the amount of storage needed compared to explicitly storing all facts. Deductive databases use logic programming through languages like Datalog to specify rules that define virtual relations. The rules allow new facts to be inferred through an inference engine even if they are not explicitly represented.
Bayesian classification is a statistical classification method that uses Bayes' theorem to calculate the probability of class membership. It provides probabilistic predictions by calculating the probabilities of classes for new data based on training data. The naive Bayesian classifier is a simple Bayesian model that assumes conditional independence between attributes, allowing faster computation. Bayesian belief networks are graphical models that represent dependencies between variables using a directed acyclic graph and conditional probability tables.
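To make the conditional-independence assumption concrete, here is a minimal naive Bayes sketch in Python; the toy weather-style attributes, the counts, and the Laplace smoothing are illustrative assumptions, not anything taken from the summarized slides.

```python
# Minimal naive Bayes sketch on hypothetical categorical data.
# P(class | x) is proportional to P(class) * product_i P(x_i | class),
# which is exactly the conditional-independence assumption.
from collections import Counter, defaultdict

# toy training set: (outlook, windy) -> play
train = [
    (("sunny", "no"), "yes"),
    (("sunny", "yes"), "no"),
    (("rain", "no"), "yes"),
    (("rain", "yes"), "no"),
    (("overcast", "no"), "yes"),
]

n_attrs = len(train[0][0])
n_values = [len({f[i] for f, _ in train}) for i in range(n_attrs)]  # distinct values per attribute
class_counts = Counter(label for _, label in train)
attr_counts = defaultdict(lambda: defaultdict(Counter))             # [class][attr index][value] -> count
for features, label in train:
    for i, value in enumerate(features):
        attr_counts[label][i][value] += 1

def posterior_scores(features):
    """Unnormalized P(class | features) with Laplace smoothing."""
    scores = {}
    for label, n_c in class_counts.items():
        score = n_c / len(train)                              # prior P(class)
        for i, value in enumerate(features):
            count = attr_counts[label][i][value]
            score *= (count + 1) / (n_c + n_values[i])         # smoothed P(value | class)
        scores[label] = score
    return scores

print(posterior_scores(("sunny", "no")))
```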
This document discusses different types of knowledge and methods for knowledge acquisition. It describes declarative and procedural knowledge, as well as the knowledge acquisition paradox where experts have difficulty verbalizing their knowledge. Various knowledge acquisition methods are outlined, including observation, problem discussion, and protocol analysis. Knowledge representation techniques like rules, semantic networks, frames, and predicate logic are also introduced.
The document discusses the knowledge discovery process (KDP). It provides the following key points:
1. KDP involves discovering useful information from data through steps like data cleaning, transformation, mining and pattern evaluation.
2. Several KDP models have been developed, including academic models with 9 steps, industrial models with 5-6 steps, and hybrid models combining aspects of both.
3. A widely used model is CRISP-DM, which stands for Cross-Industry Standard Process for Data Mining and has 6 steps: business understanding, data understanding, data preparation, modeling, evaluation and deployment.
Decision trees are a type of supervised learning algorithm used for classification and regression. ID3 and C4.5 are algorithms that generate decision trees by choosing the attribute with the highest information gain at each step. Random forest is an ensemble method that creates multiple decision trees and aggregates their results, improving accuracy. It introduces randomness when building trees to decrease variance.
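As a rough illustration of how ID3/C4.5-style splitting might score an attribute, the sketch below computes entropy and information gain on a made-up two-attribute dataset; the attribute names and values are hypothetical.

```python
# Hypothetical sketch: entropy and information gain as used for ID3/C4.5-style splits.
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(rows, attr_index, labels):
    """Entropy reduction obtained by splitting the rows on one attribute."""
    total = len(rows)
    value_counts = Counter(row[attr_index] for row in rows)
    remainder = 0.0
    for value, count in value_counts.items():
        subset = [labels[i] for i, row in enumerate(rows) if row[attr_index] == value]
        remainder += (count / total) * entropy(subset)
    return entropy(labels) - remainder

# toy data: (outlook, windy) with play / don't-play labels
rows = [("sunny", "yes"), ("sunny", "no"), ("rain", "no"), ("rain", "yes")]
labels = ["no", "yes", "yes", "no"]
print(information_gain(rows, 1, labels))   # gain of splitting on "windy" -> 1.0 here
```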
The document discusses classical AI planning and different planning approaches. It introduces state-space planning which searches for a sequence of state transformations, and plan-space planning which searches for a plan satisfying certain conditions. It also discusses hierarchical planning which decomposes tasks into simpler subtasks, and universal classical planning which uses different refinement techniques including state-space and plan-space refinements. Classical planning makes simplifying assumptions but its principles can still be applied to games with some workarounds.
Web mining is the application of data mining techniques to extract knowledge from web data, including web content, structure, and usage data. Web content mining analyzes text, images, and other unstructured data on web pages using natural language processing and information retrieval. Web structure mining examines the hyperlinks between pages to discover relationships. Web usage mining applies data mining methods to server logs and other web data to discover patterns of user behavior on websites. Text mining aims to extract useful information from unstructured text documents using techniques like summarization, information extraction, categorization, and sentiment analysis.
The document discusses database backup and recovery basics. It defines redo log files and archived log files, with redo logs recording changes made to the database for recovery and archived logs copying redo log contents for recovery. It also covers the goals of database administrators to keep databases available, types of backups (physical and logical), categories of failures (media failures and user errors), configuring for recoverability including archive log files, and the differences between no archive log mode and archive log mode.
The document defines data mining as extracting useful information from large datasets. It discusses two main types of data mining tasks: descriptive tasks like frequent pattern mining and classification/prediction tasks like decision trees. Several data mining techniques are covered, including association, classification, clustering, prediction, sequential patterns, and decision trees. Real-world applications of data mining are also outlined, such as market basket analysis, fraud detection, healthcare, education, and CRM.
This document discusses association rule mining. Association rule mining finds frequent patterns, associations, correlations, or causal structures among items in transaction databases. The Apriori algorithm is commonly used to find frequent itemsets and generate association rules. It works by iteratively joining frequent itemsets from the previous pass to generate candidates, and then pruning the candidates that have infrequent subsets. Various techniques can improve the efficiency of Apriori, such as hashing to count itemsets and pruning transactions that don't contain frequent itemsets. Alternative approaches like FP-growth compress the database into a tree structure to avoid costly scans and candidate generation. The document also discusses mining multilevel, multidimensional, and quantitative association rules.
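A minimal sketch of Apriori's candidate-generation pass, assuming itemsets are represented as Python frozensets: the join step unions (k-1)-itemsets that differ by one item, and the prune step drops candidates that have an infrequent (k-1)-subset. The grocery items are invented for illustration.

```python
# Hypothetical sketch of Apriori candidate generation (join + prune).
from itertools import combinations

def apriori_gen(frequent_prev):
    """frequent_prev: set of frozensets, all of size k-1. Returns candidate k-itemsets."""
    candidates = set()
    for a in frequent_prev:
        for b in frequent_prev:
            union = a | b
            if len(union) == len(a) + 1:          # join step: a and b differ by exactly one item
                candidates.add(union)
    # prune step: every (k-1)-subset of a candidate must itself be frequent
    return {c for c in candidates
            if all(frozenset(s) in frequent_prev for s in combinations(c, len(c) - 1))}

L2 = {frozenset({"milk", "bread"}), frozenset({"milk", "butter"}), frozenset({"bread", "butter"})}
print(apriori_gen(L2))   # -> {frozenset({'milk', 'bread', 'butter'})}
```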
This document discusses inference in first-order logic. It defines sound and complete inference and introduces substitution. It then discusses propositional vs first-order inference and introduces universal and existential quantifiers. The key techniques of first-order inference are unification, which finds substitutions to make logical expressions identical, and forward chaining inference, which applies rules like modus ponens to iteratively derive new facts from a knowledge base.
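Unification can be sketched compactly if we assume terms are nested tuples and variables are strings starting with "?"; the occurs check is omitted for brevity, so this is an illustrative approximation rather than a full first-order unifier.

```python
# Minimal unification sketch: terms are nested tuples, variables start with "?".
# Returns a substitution dict that makes the two terms identical, or None on failure.
def is_var(t):
    return isinstance(t, str) and t.startswith("?")

def unify(x, y, theta=None):
    theta = {} if theta is None else theta
    if x == y:
        return theta
    if is_var(x):
        return unify_var(x, y, theta)
    if is_var(y):
        return unify_var(y, x, theta)
    if isinstance(x, tuple) and isinstance(y, tuple) and len(x) == len(y):
        for xi, yi in zip(x, y):
            theta = unify(xi, yi, theta)
            if theta is None:
                return None
        return theta
    return None

def unify_var(var, value, theta):
    if var in theta:
        return unify(theta[var], value, theta)
    return {**theta, var: value}   # occurs check omitted in this sketch

# Knows(John, ?x) unifies with Knows(John, Jane) under {"?x": "Jane"}
print(unify(("Knows", "John", "?x"), ("Knows", "John", "Jane")))
```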
This document provides an overview of IT infrastructure architecture and networking building blocks and concepts. It discusses the evolution from mainframe computers to local area networks and the internet. The key networking concepts covered include the OSI reference model, physical layer components like cables, patch panels and network interface cards, as well as datalink layer protocols like Ethernet and Wi-Fi.
The document discusses the key components of a database system environment: hardware, software, people, procedures, and data. It describes hardware as the physical devices like computers. It explains that software includes operating systems, database management systems (DBMS), and application programs. People in the environment include administrators, designers, analysts, programmers, and end users. Procedures govern how the database system is designed and used. Data refers to the collection of facts stored in the database.
The document discusses planning and problem solving in artificial intelligence. It describes planning problems as finding a sequence of actions to achieve a given goal state from an initial state. Common assumptions in planning include atomic time steps, deterministic actions, and a closed world. Blocks world examples are provided to illustrate planning domains and representations using states, goals, and operators. Classical planning approaches like STRIPS are summarized.
This document provides an overview of data mining techniques and concepts. It defines data mining as the process of discovering interesting patterns and knowledge from large amounts of data. The key steps involved are data cleaning, integration, selection, transformation, mining, evaluation, and presentation. Common data mining techniques include classification, clustering, association rule mining, and anomaly detection. The document also discusses data sources, major applications of data mining, and challenges.
Distributed systems allow independent computers to appear as a single coherent system by connecting them through a middleware layer. They provide advantages like increased reliability, scalability, and sharing of resources. Key goals of distributed systems include resource sharing, openness, transparency, and concurrency. Common types are distributed computing systems, distributed information systems, and distributed pervasive systems.
A knowledge-based system (KBS) is a type of artificial intelligence program that uses a knowledge base to solve problems within a specialized domain that normally requires human expertise. A KBS consists of a knowledge base containing facts, rules, and heuristics about its domain, an inference engine that applies reasoning to the knowledge base, and a user interface. The knowledge base is developed by a knowledge engineer working with a domain expert to capture their expertise. A KBS can perform tasks like classification, diagnosis and planning by drawing on the captured knowledge through its inference engine.
The document discusses Internet protocols and IPTables filtering. It provides an overview of Internet protocols, IP addressing, firewall utilities, and the different types of IPTables - Filter, NAT, and Mangle tables. The Filter table is used for filtering packets. The NAT table is used for network address translation. The Mangle table is used for specialized packet alterations. IPTables works by defining rules within chains to allow or block network traffic based on packet criteria.
There are three main points about data streams and stream processing:
1) A data stream is a continuous, ordered sequence of data items that arrives too rapidly to be stored fully. Common sources include sensors, web traffic, and social media.
2) Data stream management systems process continuous queries over streams in real-time using bounded memory. They provide summaries of historical data rather than storing entire streams.
3) Challenges of stream processing include limited memory, complex continuous queries, and unpredictable data rates and characteristics. Approximate query processing techniques like windows, sampling, and load shedding help address these challenges.
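As one concrete example of approximate processing under bounded memory, the sketch below keeps a fixed-size uniform sample of an unbounded stream (reservoir sampling); the stream source and sample size are placeholders.

```python
# Hypothetical sketch: reservoir sampling keeps a bounded, uniform sample of a stream
# whose total length is unknown -- one way to cope with limited memory.
import random

def reservoir_sample(stream, k):
    """Return k items chosen uniformly at random from an arbitrarily long iterable."""
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            reservoir.append(item)
        else:
            j = random.randint(0, i)      # item i survives with probability k / (i + 1)
            if j < k:
                reservoir[j] = item
    return reservoir

print(reservoir_sample(range(1_000_000), 10))   # 10-element summary of a long "stream"
```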
This document discusses data preprocessing techniques for data mining. It explains that real-world data is often dirty, containing issues like missing values, noise, and inconsistencies. Major tasks in data preprocessing include data cleaning, integration, transformation, reduction, and discretization. Data cleaning techniques are especially important and involve filling in missing values, identifying and handling outliers, resolving inconsistencies, and reducing redundancy from data integration. Other techniques discussed include binning data for smoothing noisy values and handling missing data through various imputation methods.
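To illustrate the binning technique mentioned above, here is a small sketch of equal-depth binning followed by smoothing with bin means; the price values are textbook-style toy numbers, not data from the document.

```python
# Hypothetical sketch: equal-depth (equal-frequency) binning, then smoothing by bin means.
def smooth_by_bin_means(values, n_bins):
    data = sorted(values)
    size = len(data) // n_bins
    smoothed = []
    for b in range(n_bins):
        start = b * size
        end = len(data) if b == n_bins - 1 else start + size
        bin_values = data[start:end]
        mean = sum(bin_values) / len(bin_values)
        smoothed.extend([round(mean, 2)] * len(bin_values))   # replace each value by its bin mean
    return smoothed

prices = [4, 8, 15, 21, 21, 24, 25, 28, 34]
print(smooth_by_bin_means(prices, 3))   # -> [9.0, 9.0, 9.0, 22.0, 22.0, 22.0, 29.0, 29.0, 29.0]
```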
The document provides an introduction to distributed systems, defining them as a collection of independent computers that communicate over a network to act as a single coherent system. It discusses the motivation for and characteristics of distributed systems, including concurrency, lack of a global clock, and independence of failures. Architectural categories of distributed systems include tightly coupled and loosely coupled, with examples given of different types of distributed systems such as database management systems, ATM networks, and the internet.
Data warehousing combines data from multiple sources into a single database to provide businesses with analytics results from data mining, OLAP, scorecarding and reporting. It extracts, transforms, and loads data from operational data stores through a staging area into the data warehouse and data marts, integrating and storing large amounts of corporate data. Data mining analyzes large databases to extract previously unknown and potentially useful patterns and relationships to improve business processes.
A distributed database is a collection of logically interrelated databases distributed over a computer network. A distributed database management system (DDBMS) manages the distributed database and makes the distribution transparent to users. There are two main types of DDBMS - homogeneous and heterogeneous. Key characteristics of distributed databases include replication of fragments, shared logically related data across sites, and each site being controlled by a DBMS. Challenges include complex management, security, and increased storage requirements due to data replication.
Advance Database Management Systems - Object Oriented Principles in Database (Sonali Parab)
This document provides an overview of object-oriented database management systems (OODBMS), which combine object-oriented programming principles with database management. It discusses how OODBMSs support encapsulation, polymorphism, inheritance and ACID properties while allowing for complex objects, relationships, and queries of large amounts of data. The document also lists advantages and disadvantages of OODBMSs compared to relational database systems and examples of both proprietary and open-source OODBMSs.
Data preprocessing involves transforming raw data into an understandable and consistent format. It includes data cleaning, integration, transformation, and reduction. Data cleaning aims to fill missing values, smooth noise, and resolve inconsistencies. Data integration combines data from multiple sources. Data transformation handles tasks like normalization and aggregation to prepare the data for mining. Data reduction techniques obtain a reduced representation of data that maintains analytical results but reduces volume, such as through aggregation, dimensionality reduction, discretization, and sampling.
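As a small example of the normalization step, the following sketch rescales values into a target range with min-max normalization; the income figures are made up.

```python
# Hypothetical sketch of min-max normalization, a common data transformation step:
# rescale each value into [new_min, new_max] based on the observed range (assumes max > min).
def min_max_normalize(values, new_min=0.0, new_max=1.0):
    lo, hi = min(values), max(values)
    return [new_min + (v - lo) * (new_max - new_min) / (hi - lo) for v in values]

incomes = [12_000, 35_000, 58_000, 98_000]
print([round(x, 3) for x in min_max_normalize(incomes)])   # -> [0.0, 0.267, 0.535, 1.0]
```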
Outlier analysis is used to identify outliers, which are data objects that are inconsistent with the general behavior or model of the data. There are two main types of outlier detection - statistical distribution-based detection, which identifies outliers based on how far they are from the average statistical distribution, and distance-based detection, which finds outliers based on how far they are from other data objects. Outlier analysis is useful for tasks like fraud detection, where outliers may indicate fraudulent activity that is different from normal patterns in the data.
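A minimal sketch of the distance-based idea: flag an object as an outlier when too few other objects lie within a chosen radius. The radius, neighbor threshold, and one-dimensional values are illustrative assumptions.

```python
# Hypothetical sketch of distance-based outlier detection: an object is flagged when
# fewer than min_neighbors other objects lie within distance `radius` of it.
def distance_outliers(points, radius, min_neighbors):
    outliers = []
    for i, p in enumerate(points):
        neighbors = sum(
            1 for j, q in enumerate(points)
            if i != j and abs(p - q) <= radius
        )
        if neighbors < min_neighbors:
            outliers.append(p)
    return outliers

values = [10, 11, 12, 11, 10, 13, 42]                          # 42 sits far from the rest
print(distance_outliers(values, radius=3, min_neighbors=2))    # -> [42]
```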
The document discusses different machine learning methods: supervised learning uses labeled training data to predict outputs, while unsupervised learning finds patterns without labels. Semi-supervised learning uses a small amount of labeled and large amount of unlabeled data. Reinforcement learning algorithms interact with an environment and receive rewards or errors to maximize performance over time. Supervised methods include support vector machines, while unsupervised methods include k-means clustering.
This document analyzes and compares different statistical and machine learning methods for software effort prediction, including linear regression, support vector machine, artificial neural network, decision tree, and bagging. The researchers tested these methods on a dataset of 499 software projects. Their results showed that the decision tree method produced more accurate effort predictions than the other methods tested, performing comparably to linear regression. The decision tree approach is therefore considered effective for software effort estimation.
This document discusses machine learning approaches including supervised learning, unsupervised learning, semi-supervised learning, reinforcement learning, and deep learning. Supervised learning uses labeled training data, unsupervised learning uses unlabeled data, and semi-supervised learning uses a mixture. Reinforcement learning involves an agent learning through trial-and-error interactions in an environment. Deep learning uses neural networks with many layers to analyze complex data like images and sound. Machine learning has applications in healthcare, finance, manufacturing, transportation, and other domains, but also faces challenges regarding data quality, overfitting, and interpretability.
This document discusses machine learning and artificial intelligence. It provides an overview of the machine learning process, including obtaining raw data, preprocessing the data, applying algorithms to extract features and train models, and generating outputs. It then describes different types of machine learning, including supervised learning, unsupervised learning, reinforcement learning, and semi-supervised learning. Specific algorithms such as artificial neural networks, support vector machines, and genetic algorithms are also briefly explained. Real-world applications of machine learning like character recognition and medical diagnosis are listed.
Supervised learning is a fundamental concept in machine learning, where a computer algorithm learns from labeled data to make predictions or decisions. It is a machine learning paradigm that involves training a model on a dataset where both the input data and the corresponding desired output (or target) are provided. The goal of supervised learning is to learn a mapping or relationship between inputs and outputs so that the model can make accurate predictions on new, unseen data.
The document discusses different approaches to artificial intelligence, including rule-based and learning-based systems. It describes rule-based systems as using if-then rules to reach conclusions, while learning-based systems can adapt existing knowledge through learning. Machine learning is discussed as a type of learning-based AI that allows systems to learn from data without being explicitly programmed. Deep learning is described as a subset of machine learning that uses neural networks with multiple layers to learn from examples in a way similar to the human brain.
Machine learning builds prediction models by learning from previous data to predict the output of new data. It uses large amounts of data to build accurate models that improve automatically over time without being explicitly programmed. Machine learning detects patterns in data through supervised learning using labeled training data, unsupervised learning on unlabeled data to group similar objects, or reinforcement learning where an agent receives rewards or penalties to learn from feedback. It is widely used for problems like decision making, data mining, and finding hidden patterns.
Machine learning is a method of data analysis that automates analytical model building. It is a branch of artificial intelligence based on the idea that systems can learn from data, identify patterns, and make decisions with minimal human intervention.
Basics and fundamentals of machine learning. These slides are collected from different learning materials and organized into one slide set.
Machine learning is a form of artificial intelligence that allows systems to learn and improve automatically through experience without being explicitly programmed. There are several types of machine learning, including supervised learning (using labeled examples to predict outcomes), unsupervised learning (discovering hidden patterns in unlabeled data), and reinforcement learning (where an agent learns through trial-and-error interactions with an environment). Machine learning enables the analysis of massive amounts of data to identify opportunities or risks, though proper training is needed to ensure accurate and effective results.
Supervised learning is a type of machine learning in which machines are trained using well-labelled training data, and on the basis of that data, machines predict the output.
This document provides an overview of machine learning topics including linear regression, linear classification models, decision trees, random forests, supervised learning, unsupervised learning, reinforcement learning, and regression analysis. It defines machine learning, describes how machines learn through training, validation and application phases, and lists applications of machine learning such as risk assessment and fraud detection. It also explains key machine learning algorithms and techniques including linear regression, naive Bayes, support vector machines, decision trees, gradient descent, least squares, multiple linear regression, Bayesian linear regression, and types of machine learning models.
The fifth talk at Process Mining Camp was given by Olga Gazina and Daniel Cathala from Euroclear. As a data analyst in the internal audit department, Olga helped Daniel, an IT manager, make his life at the end of the year a bit easier by using process mining to identify key risks.
She applied process mining to the process from development to release at the Component and Data Management IT division. It looks like a simple process at first, but Daniel explains that it becomes increasingly complex when considering that multiple configurations and versions are developed, tested and released. It becomes even more complex as the projects affecting these releases are running in parallel. And on top of that, each project often impacts multiple versions and releases.
After Olga obtained the data for this process, she quickly realized that she had many candidates for the caseID, timestamp and activity. She had to find a perspective of the process that was on the right level, so that it could be recognized by the process owners. In her talk she takes us through her journey step by step and shows the challenges she encountered in each iteration. In the end, she was able to find the visualization that was hidden in the minds of the business experts.
Ann Naser Nabil - Data Scientist Portfolio.pdf (আন্ নাসের নাবিল)
I am a data scientist with a strong foundation in economics and a deep passion for AI-driven problem-solving. My academic journey includes a B.Sc. in Economics from Jahangirnagar University and a year of Physics study at Shahjalal University of Science and Technology, providing me with a solid interdisciplinary background and a sharp analytical mindset.
I have practical experience in developing and deploying machine learning and deep learning models across a range of real-world applications. Key projects include:
AI-Powered Disease Prediction & Drug Recommendation System – Deployed on Render, delivering real-time health insights through predictive analytics.
Mood-Based Movie Recommendation Engine – Uses genre preferences, sentiment, and user behavior to generate personalized film suggestions.
Medical Image Segmentation with GANs (Ongoing) – Developing generative adversarial models for cancer and tumor detection in radiology.
In addition, I have developed three Python packages focused on:
Data Visualization
Preprocessing Pipelines
Automated Benchmarking of Machine Learning Models
My technical toolkit includes Python, NumPy, Pandas, Scikit-learn, TensorFlow, Keras, Matplotlib, and Seaborn. I am also proficient in feature engineering, model optimization, and storytelling with data.
Beyond data science, my background as a freelance writer for Earki and Prothom Alo has refined my ability to communicate complex technical ideas to diverse audiences.
The third speaker at Process Mining Camp 2018 was Dinesh Das from Microsoft. Dinesh Das is the Data Science manager in Microsoft’s Core Services Engineering and Operations organization.
Machine learning and cognitive solutions give opportunities to reimagine digital processes every day. This goes beyond translating the process mining insights into improvements and into controlling the processes in real-time and being able to act on this with advanced analytics on future scenarios.
Dinesh sees process mining as a silver bullet to achieve this, and he shared his learnings and experiences based on the proof of concept on the global trade process. This process from order to delivery is a collaboration between Microsoft and the distribution partners in the supply chain. Data of each transaction was captured and process mining was applied to understand the process and capture the business rules (for example, setting the benchmark for the service level agreement). These business rules can then be operationalized to continuously measure fulfillment and to create triggers to act, using machine learning and AI.
Using the process mining insight, the main variants are translated into Visio process maps for monitoring. The tracking of the performance of this process happens in real-time to see when cases become too late. The next step is to predict in what situations cases are too late and to find alternative routes.
As an example, Dinesh showed how machine learning could be used in this scenario. A TradeChatBot was developed based on machine learning to answer questions about the process. Dinesh showed a demo of the bot that was able to answer questions about the process by chat interactions. For example: “Which cases need to be handled today or require special care as they are expected to be too late?”. In addition to the insights from the monitoring business rules, the bot was also able to answer questions about the expected sequences of particular cases. In order for the bot to answer these questions, the result of the process mining analysis was used as a basis for machine learning.
Niyi started with process mining on a cold winter morning in January 2017, when he received an email from a colleague telling him about process mining. In his talk, he shared his process mining journey and the five lessons they have learned so far.
The fourth speaker at Process Mining Camp 2018 was Wim Kouwenhoven from the City of Amsterdam. Amsterdam is well-known as the capital of the Netherlands and the City of Amsterdam is the municipality defining and governing local policies. Wim is a program manager responsible for improving and controlling the financial function.
A new way of doing things requires a different approach. While introducing process mining they used a five-step approach:
Step 1: Awareness
Introducing process mining is a little bit different in every organization. You need to fit something new to the context, or even create the context. At the City of Amsterdam, the key stakeholders in the financial and process improvement department were invited to join a workshop to learn what process mining is and to discuss what it could do for Amsterdam.
Step 2: Learn
As Wim put it, at the City of Amsterdam they are very good at thinking about something and creating plans, thinking about it a bit more, and then redesigning the plan and talking about it a bit more. So, they deliberately created a very small plan to quickly start experimenting with process mining in a small pilot. The scope of the initial project was to analyze the Purchase-to-Pay process for one department covering four teams. As a result, they were able to show that they could answer five key questions, which created an appetite for more.
Step 3: Plan
During the learning phase they only planned for the goals and approach of the pilot, without carving the objectives for the whole organization in stone. As the appetite was growing, more stakeholders were involved to plan for a broader adoption of process mining. While there was interest in process mining in the broader organization, they decided to keep focusing on making process mining a success in their financial department.
Step 4: Act
After the planning they started to strengthen the commitment. The director for the financial department took ownership and created time and support for the employees, team leaders, managers and directors. They started to develop the process mining capability by organizing training sessions for the teams and internal audit. After the training, they applied process mining in practice by deepening their analysis of the pilot by looking at e-invoicing, deleted invoices, analyzing the process by supplier, looking at new opportunities for audit, etc. As a result, the lead time for invoices was decreased by 8 days by preventing rework and by making the approval process more efficient. Even more important, they could further strengthen the commitment by convincing the stakeholders of the value.
Step 5: Act again
After convincing the stakeholders of the value you need to consolidate the success by acting again. Therefore, a team of process mining analysts was created to be able to meet the demand and sustain the success. Furthermore, new experiments were started to see how process mining could be used in three audits in 2018.
Dr. Robert Krug - Expert in Artificial Intelligence (Dr. Robert Krug)
Dr. Robert Krug is a New York-based expert in artificial intelligence, with a Ph.D. in Computer Science from Columbia University. He serves as Chief Data Scientist at DataInnovate Solutions, where his work focuses on applying machine learning models to improve business performance and strengthen cybersecurity measures. With over 15 years of experience, Robert has a track record of delivering impactful results. Away from his professional endeavors, Robert enjoys the strategic thinking of chess and urban photography.
indonesia-gen-z-report-2024 (disnakertransjabarda)
Gen Z (born between 1997 and 2012) is currently the biggest generation group in Indonesia, with 27.94% of the total population, or 74.93 million people.
AI ------------------------------ W1L2.pptx (AyeshaJalil6)
This lecture provides a foundational understanding of Artificial Intelligence (AI), exploring its history, core concepts, and real-world applications. Students will learn about intelligent agents, machine learning, neural networks, natural language processing, and robotics. The lecture also covers ethical concerns and the future impact of AI on various industries. Designed for beginners, it uses simple language, engaging examples, and interactive discussions to make AI concepts accessible and exciting.
By the end of this lecture, students will have a clear understanding of what AI is, how it works, and where it's headed.
1. LEARNING METHODS
Group members: M. Ishaq Zaman, Arslan Nazir, Bilal Latif
2. WHAT IS LEARNING?
• The process of learning begins with observation of data, such as examples, direct experience, or instructions.
• Its aim is to allow computers to learn automatically, without human intervention or assistance, and to adjust actions accordingly.
4. SUPERVISED MACHINE LEARNING
• Can apply what has been learned in the past to new data, using labeled examples to predict future events.
• Starting from the analysis of a known training dataset, the learning algorithm produces an inferred function to make predictions about the output values.
• The algorithm analyzes the training data set and produces an inferred function.
• If the output of the function is discrete it is called a classifier; if the output is continuous it is called a regression function.
5. EXAMPLE
• If the inputs are 1, 2, 3, 4, 5, 6 and the corresponding outputs are 1, 4, 9, 16, 25, 36,
• then we can predict the next output using the function learned from these pairs, which is output = input^2.
• So if the next input is 7, the function gives an output of 49 (a small code sketch follows below).
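A tiny sketch of this worked example: fit the six labeled pairs and predict the output for input 7. Using numpy's polynomial fit is just one convenient way to recover the input-squared mapping; the slides themselves do not prescribe a method.

```python
# Sketch of the slide's example: learn the mapping from six labeled pairs,
# then predict the output for input 7.
import numpy as np

inputs = np.array([1, 2, 3, 4, 5, 6])
outputs = np.array([1, 4, 9, 16, 25, 36])

coeffs = np.polyfit(inputs, outputs, deg=2)        # fits ~[1, 0, 0], i.e. output = input**2
print(round(float(np.polyval(coeffs, 7))))         # -> 49
```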
7. UNSUPERVISED MACHINE LEARNING
• No labels are given to the learning algorithm, leaving it on its own to find structure in its inputs.
• Unsupervised learning can be a goal in itself (discovering hidden patterns in data).
• The data have no target attribute.
8. EXAMPLE
• You have a bunch of photos of 6 people, but without information about who is in which one, and you want to divide this dataset into 6 piles, each with photos of one individual (see the clustering sketch below).
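A hedged sketch of how this grouping could be done with k-means: in practice each photo would first be converted into a feature vector (for example a face embedding); random vectors stand in for those features here, and scikit-learn's KMeans is just one possible clustering choice.

```python
# Hypothetical sketch of the photo-grouping example: cluster feature vectors into 6 piles.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
photo_features = rng.normal(size=(60, 128))       # 60 "photos", each as a 128-dim feature vector

labels = KMeans(n_clusters=6, n_init=10, random_state=0).fit_predict(photo_features)
print(labels[:10])                                 # cluster id (pile) assigned to each photo
```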
10. SEMI-SUPERVISED LEARNING
• Semi-supervised learning falls in between supervised and unsupervised learning.
• Semi-supervised learning uses a small amount of labeled data and a large amount of unlabeled data.
• The goal is to learn a predictor that predicts future test data better than the predictor learned from the labeled training data alone.
• This can be used, for example, in deep belief networks, where some layers learn the structure of the data (unsupervised) and one layer is used to make the classification (trained with supervised data).
11. REINFORCEMENT MACHINE LEARNING ALGORITHMS
• A learning method that interacts with its environment by producing actions and discovering errors or rewards.
• The reinforcement learning algorithm (called the agent) continuously learns from the environment in an iterative fashion. In the process, the agent learns from its experiences of the environment until it explores the full range of possible states.
• This method allows machines and software agents to automatically determine the behavior within a specific context in order to maximize performance.
12. HOW A REINFORCEMENT LEARNING ALGORITHM WORKS
• In order to produce intelligent programs (also called agents), reinforcement learning goes through the following steps:
• The input state is observed by the agent.
• A decision-making function is used to make the agent perform an action.
• After the action is performed, the agent receives a reward or reinforcement from the environment.
• The state-action pair information about the reward is stored (a small Q-learning sketch follows below).
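The observe-act-reward-store loop above can be sketched as tabular Q-learning on a tiny made-up environment; the environment, hyperparameters, and epsilon-greedy policy below are illustrative assumptions, not part of the original slides.

```python
# Hypothetical sketch of the observe -> act -> reward -> store loop as tabular Q-learning.
import random
from collections import defaultdict

def step(state, action):
    """Tiny line-world: states 0..3, action 1 moves right, 0 moves left; reward at state 3."""
    next_state = max(0, min(3, state + (1 if action == 1 else -1)))
    reward = 1.0 if next_state == 3 else 0.0
    return next_state, reward

q = defaultdict(float)                     # stored value for each state-action pair
alpha, gamma, epsilon = 0.5, 0.9, 0.1      # learning rate, discount, exploration rate

for episode in range(500):
    state = 0
    for _ in range(100):                   # cap episode length
        # decision-making function: epsilon-greedy over the stored Q-values
        if random.random() < epsilon:
            action = random.choice([0, 1])
        else:
            action = max((0, 1), key=lambda a: q[(state, a)])
        next_state, reward = step(state, action)           # reward comes from the environment
        best_next = max(q[(next_state, a)] for a in (0, 1))
        # store updated information about the state-action pair
        q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
        state = next_state
        if state == 3:                     # terminal state reached
            break

print({sa: round(v, 2) for sa, v in sorted(q.items())})
```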