Getting Data Cloud Ready at Scale
Foundational Element in Supporting University and Campus Goals
George Mansoor, Chief Information Systems Officer
Agenda
• A Focus On Why We Are Here
• Modernizing Strategically
• The Path Forward
A focus on Students
Graduation Initiative 2025: Launched in 2015, it is an
ambitious plan to increase graduation rates, eliminate equity gaps in degree
completion and meet California’s workforce needs.
Being strategic: Each campus and the Chancellor’s Office
use this vision as the litmus test when spending scarce
resources on strategic initiatives.
Background on our ERP system
• PeopleSoft ERP supporting our Finance,
Student and HR data since 2001
• 23 Campus Solutions
• 23 Human Resources, migrating to Single
HR starting 2022
• Single Finance since 2011
• 23 campuses, and eight off-campus
centers
• Support to the largest public higher
education institution - 485,500 students
with 56,000 faculty and staff
Challenges of Accessing ERP Data
• Students and faculty need improved engagement and expect a stellar
and seamless experience
• Data from the ERP provides critical information about our students
(financial aid, enrollment, graduation requirements, hiring, etc.)
• Campuses need data from our ERP
• Current PeopleSoft ERP is highly customized, making integrations
costly and time-consuming.
• ERP is aging and we have grown!
• 2001: 370,000 students with 40,000 faculty and staff
• Today: 485,000 students with 56,000 faculty and staff
• To cover ERP gaps we have accumulated an assortment of systems,
especially in the Student Information Systems, where student
engagement systems have grown.
• Limited resources
Overview
• Objective: CSU was looking to modernize how we process data and wanted
to get our data “cloud ready” and “report ready” to take advantage of new
capabilities that the cloud offered.
• Problem: CSU data resides in on-prem legacy ERP (PeopleSoft/Oracle)
systems
Cloud Ready Repository (CMS Data Lake)
• Repository of raw copies of source system data
• Today that is largely CMS data sources with plans to include other sources
over time (CSULearn/SumTotal, Person Data Management (PDM))
• Data is stored in Apache Parquet format
• Intended to facilitate the use of cloud-based data tools using CMS data
• Built to support the CSU Data Lake
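Parquet is a columnar file format: the values of one column are stored together, so a tool that needs only a few columns can skip the rest. The sketch below illustrates just that idea with plain Python lists. It does not write real Parquet files, and the field names (`emplid`, `term`, `units`) are hypothetical, not taken from CMS.

```python
from typing import Any


def rows_to_columns(rows: list[dict[str, Any]]) -> dict[str, list[Any]]:
    """Pivot row-oriented records into one list per column.

    Assumes every row carries the same keys; otherwise the
    column lists would fall out of alignment.
    """
    columns: dict[str, list[Any]] = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


# Hypothetical raw rows, shaped the way a row-oriented source hands them over.
rows = [
    {"emplid": "A1", "term": "2227", "units": 12},
    {"emplid": "B2", "term": "2227", "units": 9},
    {"emplid": "A1", "term": "2232", "units": 15},
]

columns = rows_to_columns(rows)
# An aggregate over one column touches only that column's values.
total_units = sum(columns["units"])
```

A real pipeline would more likely rely on a Parquet library than on hand-rolled structures; the point here is only why a column-oriented layout suits analytic reads.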
Our Data Lakes
Report Ready Repository (CSU Data Lake)
• Fairly comprehensive data and reporting solution
• Uses source data and transforms data into rich data collections
• Data collections targeted towards reporting
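Turning raw source rows into a reporting collection can be thought of as a group-by. The sketch below builds a small "students by term" style collection from hypothetical enrollment rows. The field names and the output shape are assumptions for illustration, not the CSU Data Lake's actual schema.

```python
from collections import defaultdict


def students_by_term(enrollments: list[dict[str, str]]) -> dict[str, list[str]]:
    """Group distinct student IDs under each term, sorted for stable output."""
    grouped: dict[str, set[str]] = defaultdict(set)
    for row in enrollments:
        grouped[row["term"]].add(row["student_id"])
    return {term: sorted(ids) for term, ids in sorted(grouped.items())}


# Hypothetical raw rows: one row per student per class.
raw_enrollments = [
    {"student_id": "S2", "term": "Fall 2022", "class": "MATH 101"},
    {"student_id": "S1", "term": "Fall 2022", "class": "MATH 101"},
    {"student_id": "S1", "term": "Fall 2022", "class": "HIST 110"},
    {"student_id": "S1", "term": "Spring 2023", "class": "MATH 102"},
]

collection = students_by_term(raw_enrollments)
```

The raw rows stay untouched; the collection is a derived view aimed at reporting, which mirrors the split between a raw repository and a report-ready one.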
CMS Data Lake - Data as a Service
[Diagram labels: On-prem production; AWS: CMS Data Lake and CSU Data Lake, with VPC peering. CSU Data Lake collections: Students, Students by Term, Students by Class, Students by Degree, Classes by Section, Applications by Applicant]
© 2017 Unisys Corporation. All rights reserved.
First Big Problem
• Our biggest source of data is our ERP systems. They are on-prem and will stay there.
• 47 production instances (consistent with 23 Campus Solutions + 23 Human Resources + 1 Finance)
• Approx. 1 TB aggregate of “interesting” data.
• Oracle RDBMS
• OLTP-optimized
How do we get this to the cloud?
Delphix Data Virtualization Platform
Data virtualization decouples the database layer that sits between the storage and
application layers in the application stack. Just like a hypervisor sits between the server
and the OS to create a virtual server, database virtualization software sits between the
database and the OS to abstract/virtualize the data store resources. Because database
resources are virtualized, they require a much smaller storage footprint than the source
database. Instead of making and moving new blocks of data, virtual data
copies use pointers to data blocks, providing high-performance access to data already in
place.
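The pointer idea described above can be sketched as copy-on-write: a virtual copy shares every unchanged block with a snapshot and stores only the blocks it overwrites. This is a generic illustration under that assumption, not Delphix's implementation; the class and method names are hypothetical.

```python
class SourceSnapshot:
    """Immutable set of data blocks captured from a source database."""

    def __init__(self, blocks: list[bytes]) -> None:
        self.blocks = tuple(blocks)


class VirtualCopy:
    """A copy that shares unchanged blocks and stores only its own writes."""

    def __init__(self, snapshot: SourceSnapshot) -> None:
        self._snapshot = snapshot
        self._changed: dict[int, bytes] = {}  # block index -> private version

    def read(self, index: int) -> bytes:
        # Prefer a privately written block; otherwise follow the
        # pointer back to the shared snapshot.
        return self._changed.get(index, self._snapshot.blocks[index])

    def write(self, index: int, data: bytes) -> None:
        if not 0 <= index < len(self._snapshot.blocks):
            raise IndexError("block index out of range")
        self._changed[index] = data

    def private_bytes(self) -> int:
        """Storage consumed by this copy alone."""
        return sum(len(block) for block in self._changed.values())


snapshot = SourceSnapshot([b"AAAA", b"BBBB", b"CCCC"])
dev_copy = VirtualCopy(snapshot)
test_copy = VirtualCopy(snapshot)

# Only the writing copy pays for the changed block.
dev_copy.write(1, b"XXXX")
```

Two copies of the same snapshot cost almost nothing until one of them writes, which is why virtual copies need a much smaller storage footprint than full clones.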
AWS Database Migration Service (DMS)
AWS Database Migration Service (AWS DMS) helps you migrate databases
to AWS quickly and securely. The source database remains fully operational
during the migration, minimizing downtime to applications that rely on the
database. The AWS Database Migration Service can migrate your data to
and from the most widely used commercial and open-source databases.
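A DMS replication task is told which tables to move through a table-mappings JSON document made of selection rules. The sketch below builds one such document in plain Python. The schema name `SYSADM` and the catch-all table pattern are placeholders chosen for illustration; the slides do not say which schemas or tables were selected.

```python
import json


def selection_rule(rule_id: int, schema: str, table_pattern: str) -> dict:
    """Build one DMS selection rule that includes matching tables."""
    return {
        "rule-type": "selection",
        "rule-id": str(rule_id),
        "rule-name": str(rule_id),
        "object-locator": {"schema-name": schema, "table-name": table_pattern},
        "rule-action": "include",
    }


# "SYSADM" is a placeholder schema name, not taken from the slides.
# "%" is the DMS wildcard and here matches every table in that schema.
table_mappings = {"rules": [selection_rule(1, "SYSADM", "%")]}
table_mappings_json = json.dumps(table_mappings, indent=2)
```

The resulting JSON string is the kind of value supplied as the table mappings when a replication task is created, for example through the console or an SDK call.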
CMS Data Lake - Data as a Service
[Diagram labels: On-premise production (CS, HR, FIN); continuous replication over secure VPN; AWS cloud: Oracle DB on EC2, DMS instances, DMS tasks, S3 copy, campus-specific S3 buckets]
• Data is copied into encrypted S3 buckets
• Historical data is stored in date-wise S3 bucket folders
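Campus-specific buckets with date-wise folders amount to a naming convention for bucket names and object keys. The sketch below shows one way such a convention could look. The bucket prefix, campus code, folder order, table name, and file name are all hypothetical; the slides do not give the actual layout.

```python
from datetime import date


def campus_bucket(campus_code: str, prefix: str = "example-cms-datalake") -> str:
    """Name of the campus-specific bucket (naming scheme is hypothetical)."""
    return f"{prefix}-{campus_code.lower()}"


def dated_key(module: str, table: str, load_date: date, filename: str) -> str:
    """Object key that files each load under a date-wise folder."""
    return f"{module.lower()}/{table}/{load_date:%Y/%m/%d}/{filename}"


# Placeholder campus code and table name, for illustration only.
bucket = campus_bucket("CAMPUS01")
key = dated_key("CS", "EXAMPLE_TABLE", date(2022, 8, 13), "part-0000.parquet")
```

Keeping the date in the key means a historical load is never overwritten by a later one, and a reader can list a single day's folder without scanning the whole bucket.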
Questions
Data Con LA 2022 - Moving Data at Scale to AWS
