A TALE OF TWO SYSTEMS:
INSIGHTS FROM
SOFTWARE ARCHITECTURE
DAVID MAX
Senior Software Engineer
ABOUT LINKEDIN NEW YORK CITY
● Located in Empire State Building.
● Approximately 90 engineers out of
about 1000 employees total.
● Multiple teams: front end, back end, and
data science.
#nwd2018
WHAT “TWO SYSTEMS”?
System 1
● A working system that is nearing the limits of its capacity.
System 2
● The replacement system designed to address the capacity issues.
○ Solves the capacity problem…
○ …but utterly fails in other ways.
ANTI-PATTERN
“A common response to a recurring
problem that is usually ineffective and
risks being highly counterproductive.”
– Wikipedia
“An antipattern is just like a pattern,
except that instead of a solution it gives
something that looks superficially like a
solution but isn’t one.”
– Andrew Koenig
COACH VS. ROOKIE
More powerful conceptual
models help us better make
sense of what we see.
WHAT THE COACH HAS IS...
“...a set of mental abstractions that allow him to convert his
perceptions of raw phenomena, such as a ball being passed, into a
condensed and integrated understanding of what is happening,
such as the success of an offensive strategy.
The coach watches the same game that the rookie does, but he
understands it better.”
– George Fairbanks, Just Enough Software Architecture
THINKING LIKE A
COACH -
CONCEPTUAL MODELS
“Software Architecture refers to the high
level structures of a software system, the
discipline of creating such structures,
and the documentation of these
structures. These structures are needed
to reason about the software system.”
– Wikipedia
“Software architecture is the set of design
decisions which, if made incorrectly, may
cause your project to be cancelled.”
― Eoin Woods
What is Software Architecture?
ARCHITECTURALLY SIGNIFICANT REQUIREMENTS (ASRs)
Constraints - Unchangeable design decisions, usually given, sometimes
chosen.
Quality Attributes - Externally visible properties that characterize how
the system operates in a specific context.
Influential Functional Requirements - Features and functions that
require special attention in the architecture.
Other Influencers - Time, knowledge, experience, skills, office politics,
your own geeky biases, and all the other stuff that sways your decision
making.
― Michael Keeling, Design It!
QUALITY ATTRIBUTES - STANDARD BLENDER
Pros:
● Powerful motor (550 Watts)
● Sits well on kitchen counter
● Dishwasher safe
Cons:
● Must be plugged in
● Limited portability
(example from Design It! by Michael Keeling)
CORDLESS RECHARGEABLE HAND BLENDER
Pros:
● Small, very portable
● Doesn’t need electric outlet to operate
● Very easy to clean
Cons:
● Less powerful (2.5 Watts)
● Needs to be recharged after 20 minutes
● Must hold in hand to operate
CHAINSAW BLENDER
Pros:
● Portable, doesn’t need electric outlet
● Powerful! (37cc gas-powered engine)
Cons:
● Tad loud
● Emits exhaust unsafe for indoor use
● Not suitable for kitchen countertop use
TAKEAWAYS
● Three solutions for accomplishing the same task
● Each solution promotes a different set of quality attributes
● Quality attributes often trade off against each other
● The “best” design depends on which properties are most highly valued
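The last takeaway can be made concrete with a small weighted-scoring sketch. The 1 to 5 scores and the three weight sets below are hypothetical; the slides give pros and cons but no numeric ratings. The point is only that the same three designs rank differently once the weights change.

```python
# Hypothetical 1-5 scores per quality attribute (not from the slides).
SCORES = {
    "standard": {"power": 4, "portability": 1, "indoor_use": 5},
    "cordless": {"power": 1, "portability": 5, "indoor_use": 5},
    "chainsaw": {"power": 5, "portability": 4, "indoor_use": 1},
}


def best_design(weights):
    """Return the design with the highest weighted score for the given weights."""
    def total(name):
        return sum(weights[attr] * score for attr, score in SCORES[name].items())
    return max(SCORES, key=total)


# Three hypothetical contexts that value the attributes differently.
kitchen = best_design({"power": 2, "portability": 1, "indoor_use": 3})
camping = best_design({"power": 1, "portability": 3, "indoor_use": 1})
tailgate = best_design({"power": 3, "portability": 2, "indoor_use": 0})

print(kitchen, camping, tailgate)  # standard cordless chainsaw
```

No design wins under every weighting, which is why the valued properties have to be decided before the design is.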
AGGREGATION
[Diagram: Input files -> Processing -> Output file]
OLD SYSTEM FLOW
PROBLEMS
● Aggregator terminates with an out-of-memory error on the
largest inputs.
● Task Manager shows there’s plenty of memory left.
● A single memory allocation is requesting well over 500MB at
once, and fails.
WHO NEEDS 500MB at once?
If there is plenty of memory left, why is it failing?
WIN32 PROCESS ADDRESS SPACE
80000000 - FFFFFFFF (2 GB)
System virtual address space.
Reserved for use by system.
00000000 - 7FFFFFFF (2 GB)
Per-process virtual address space.
Available for use by applications.
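As a quick check of the numbers on this slide, the arithmetic below computes the size of each half of a 32-bit address space. The 16 GiB machine is a hypothetical example; the sketch only illustrates that installed RAM does not raise the per-process address limit.

```python
GIB = 2**30

# Default 32-bit Windows split: low half for the process, high half for the system.
USER_START, USER_END = 0x00000000, 0x7FFFFFFF
SYSTEM_START, SYSTEM_END = 0x80000000, 0xFFFFFFFF


def span(start, end):
    """Number of addressable bytes in an inclusive address range."""
    return end - start + 1


user_bytes = span(USER_START, USER_END)
system_bytes = span(SYSTEM_START, SYSTEM_END)

# Hypothetical machine with 16 GiB of RAM: one 32-bit process still sees 2 GiB.
installed_ram = 16 * GIB
addressable_by_one_process = min(installed_ram, user_bytes)

print(user_bytes // GIB, system_bytes // GIB, addressable_by_one_process // GIB)  # 2 2 2
```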
MEMORY MAPPED FILE
ADDRESS SPACE FRAGMENTATION
Even with plenty of memory available, fragmentation of the
address space means there’s not enough contiguous address space
to fit this new block:
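A minimal simulation of that situation, assuming a made-up layout: a 2 GB user address space with a 1 MB block (for example a loaded library) pinned every 400 MB. The layout is hypothetical and the function assumes the allocations do not overlap. It reports both the total free space and the largest contiguous gap.

```python
MB = 2**20


def free_space(space_size, allocations):
    """Return (total_free, largest_contiguous_gap) for non-overlapping (start, size) blocks."""
    total_free = space_size
    largest = 0
    cursor = 0
    for start, size in sorted(allocations):
        largest = max(largest, start - cursor)  # gap before this block
        total_free -= size
        cursor = start + size
    largest = max(largest, space_size - cursor)  # gap after the last block
    return total_free, largest


space = 2048 * MB
# Hypothetical: five 1 MB blocks at 399, 799, 1199, 1599 and 1999 MB.
allocations = [(offset * MB, 1 * MB) for offset in range(399, 2048, 400)]

total_free, largest = free_space(space, allocations)
request = 500 * MB

print(total_free // MB, largest // MB, request <= largest)  # 2043 399 False
```

Over 2000 MB is free in total, yet a single 500 MB request cannot be placed because no gap is larger than 399 MB.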
COACHABLE MOMENT
● Don’t wait until your system is already blowing up.
● Some scaling problems can’t be solved by buying a bigger computer.
LET’S FIX IT!
Symptom: Aggregator is failing with an out-of-memory error.
Reason: Output file is too large to fit in a Win32 memory mapped file.
Analysis: Current implementation can’t scale beyond a certain size output.
Conclusion: We have a scalability problem.
Solution: Replace aggregation data store with a more scalable solution.
IN-MEMORY DISTRIBUTED DATA CACHE
NEW ARCHITECTURE HAS NICE NEW ATTRIBUTES
NEW ARCHITECTURE OFFERS NEW SCALABILITY OPTIONS
Increasing Scalability
OLD SYSTEM FLOW
NEW SYSTEM FLOW
RUN TIME PERFORMANCE (NIGHTLY BATCH)
ROOKIE MISTAKES
● Include all constraints
○ Fixated on scalability
○ Forgot that we also had an important time constraint!
● Quality Attributes
○ Worried mainly about scalability, time to implement, and reducing
changes to other parts of the system.
○ Forgot that quality attributes trade off against each other, and did
not analyze to what extent scalability is an ASR.
● Other differences
○ Single process memory mapped files have different performance
characteristics from in-memory distributed data caches.
SIGNIFICANT DIFFERENCES
Scenario - Lots of workers writing to same record.
Memory Mapped File - Best performance because the memory page is
most likely to be in memory. Less likely to need to swap to disk.
[Diagram: File on Disk, Mapped Address Range, Memory Page, CPU Cache, five Workers]
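To show what writing to one record through a memory mapped file looks like, here is a minimal sketch using Python's standard `mmap` module. The file name, the 8-byte record layout, and the `aggregate` helper are assumptions made for illustration, not the system described in the talk. The increments run one after another in a single thread; real concurrent workers would also need synchronization.

```python
import mmap
import os
import struct
import tempfile

RECORD = struct.Struct("<q")  # hypothetical layout: one 8-byte counter per record


def aggregate(path, record_index, increments):
    """Add each increment to one record through a memory-mapped view of the file."""
    with open(path, "r+b") as f, mmap.mmap(f.fileno(), 0) as view:
        offset = record_index * RECORD.size
        for value in increments:
            (current,) = RECORD.unpack_from(view, offset)
            # Every update lands on the same page, which stays resident in memory.
            RECORD.pack_into(view, offset, current + value)
        view.flush()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "aggregate.bin")
    with open(path, "wb") as f:
        f.write(bytes(RECORD.size * 4))  # four zeroed records
    aggregate(path, record_index=2, increments=[1, 2, 3, 4, 5])
    with open(path, "rb") as f:
        data = f.read()
    result = RECORD.unpack_from(data, 2 * RECORD.size)[0]

print(result)  # 15
```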
IN-MEMORY DISTRIBUTED CACHE
Scenario - Lots of workers writing to same record.
Worst performance when workers write to the
same record on different machines because of
node-to-node synchronization.
[Diagram: six Nodes, four Workers]
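The cost of that synchronization can be illustrated with a toy model. It assumes that the latest copy of a record must move to whichever node receives the next write; real cache products use their own protocols, so treat the counts as illustrative only. `count_transfers` and the node names are hypothetical.

```python
def count_transfers(writes):
    """Count node-to-node transfers for (node, record) writes in arrival order."""
    holder = {}  # record -> node holding the latest copy
    transfers = 0
    for node, record in writes:
        if record in holder and holder[record] != node:
            transfers += 1  # the record has to be synchronized between nodes
        holder[record] = node
    return transfers


# Four workers on four different nodes taking turns writing one record.
contended = [(node, "r1") for _ in range(25) for node in ("A", "B", "C", "D")]
# The same 100 writes, all arriving through a single node.
local = [("A", "r1")] * 100

print(count_transfers(contended), count_transfers(local))  # 99 0
```

In this model nearly every contended write pays for a transfer, while the local writes pay for none.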
IN-MEMORY DISTRIBUTED CACHE
Scenario - Lots of workers writing to same node.
Poor performance because unable to distribute load.
[Diagram: six Nodes, eight Workers]
MEMORY MAPPED FILE
Scenario - Every worker writes to a different record.
Worse performance, because fewer cache hits,
more page faults, and more disk I/O.
[Diagram: File on Disk, Mapped Address Range, two Memory Pages, CPU Cache, five Workers, Page Fault]
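A rough way to see why scattered writes hurt is to count how many distinct memory pages a set of record writes touches. The 64-byte record size is hypothetical, and 4096 bytes is assumed as the page size. More pages touched means more chances for a page fault and disk I/O.

```python
PAGE_SIZE = 4096   # assumed page size in bytes
RECORD_SIZE = 64   # hypothetical record size; divides the page size evenly


def pages_touched(record_indexes):
    """Return the number of distinct pages that the given records fall on."""
    return len({(index * RECORD_SIZE) // PAGE_SIZE for index in record_indexes})


same_record = [7] * 1000                     # 1000 writes to one record
scattered = [i * 1000 for i in range(1000)]  # 1000 writes, each far from the last

print(pages_touched(same_record), pages_touched(scattered))  # 1 1000
```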
IN-MEMORY DISTRIBUTED CACHE
Scenario - Records associated with particular nodes. Load distributed over nodes.
Best performance. Record locality minimizes node-to-node synchronization.
Distributing connections over the cluster promotes better scaling.
[Diagram: six Nodes, nine Workers]
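One common way to get this kind of record locality is to route every write for a record to a fixed owner node chosen by a stable hash of the record key. The sketch below assumes that approach; the node names, the `owner` and `route` helpers, and the use of CRC32 are illustrative choices, not details from the talk.

```python
import zlib
from collections import Counter

NODES = ["n1", "n2", "n3", "n4", "n5", "n6"]  # hypothetical six-node cluster


def owner(record_key):
    """Map a record key to one node using a hash that is stable across runs."""
    return NODES[zlib.crc32(record_key.encode("utf-8")) % len(NODES)]


def route(record_keys):
    """Send each write to the node that owns its record; return writes per node."""
    return Counter(owner(key) for key in record_keys)


# Hypothetical workload: 50 records, each written 20 times.
writes = [f"record-{i}" for i in range(50)] * 20
load = route(writes)

for node in NODES:
    print(node, load[node])
```

Because a record always maps to the same node, no write needs a transfer between nodes, and the hash spreads different records across the cluster.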
CONCLUSION
● Thinking about the architecture helps us better understand how what
we are building addresses the important requirements.
● Promoting one quality attribute usually involves some kind of tradeoff.
Software Engineering is the discipline of balancing tradeoffs.
● The architecture is the hardest thing to change after the fact, so it pays
to invest some time up front analyzing the ASRs.
● Don’t wait until your system is falling over to make needed changes.
Less time spent on the architecture up front often means more time
spent doing avoidable rework later.
Thank You!
linkedin.com/in/davidpmax