PyData Amsterdam - Name Matching at Scale (GoDataDriven)
Wendell Kuling works as a Data Scientist at ING in the Wholesale Banking Advanced Analytics team. Their projects aim to provide better services to corporate customers of ING by using innovative data-science techniques. In this talk, Wendell covers key insights from their experience in matching large datasets based on names. After covering the key algorithms and packages ING uses for name matching, Wendell will share his best-practice approach to applying these algorithms at scale… would you bet on a Cruncher (48-CPU/512 GB RAM machine), a Tesla (CUDA Tesla K80 with 4992 cores, 24 GB memory) or a Spark cluster (80 cores/2.5 TB memory)?
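As a minimal illustration of the name-matching problem (a naive pairwise matcher using Python's standard library, not ING's actual approach, and with made-up names):

```python
# Hypothetical sketch: score name pairs after light normalization.
from difflib import SequenceMatcher

def name_similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity score between two company names."""
    norm = lambda s: " ".join(s.lower().replace(".", "").split())
    return SequenceMatcher(None, norm(a), norm(b)).ratio()

candidates = ["ING Bank N.V.", "ABN AMRO Bank", "ING Groep"]
query = "ing bank nv"
best = max(candidates, key=lambda c: name_similarity(query, c))
print(best)  # ING Bank N.V.
```

Exact pairwise scoring like this is quadratic in the number of names, which is precisely why the talk's question about hardware and blocking strategies matters at scale.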
The document provides tips for developing winning federal proposals. It emphasizes focusing on the customer's needs, customizing the proposal to the specific opportunity, and using a consistent and concise writing style. Key recommendations include putting the customer first, demonstrating a commitment to jointly achieving objectives, tailoring the solution and language to the requester, using compelling and creative elements like graphics and examples, and ensuring technical and political correctness.
This document provides an introduction to genetic algorithms. It explains that genetic algorithms are inspired by Darwinian evolution and use processes like selection, crossover and mutation to iteratively improve a population of potential solutions. It discusses how genetic algorithms can be used for optimization problems and classification in data mining. Examples of genetic algorithm applications like the traveling salesman problem are also presented to illustrate genetic algorithm concepts and processes.
Yoav Goldberg: Word Embeddings - What, How and Whither (MLReview)
This document discusses word embeddings and how they work. It begins by explaining how the author became an expert in distributional semantics without realizing it. It then discusses how word2vec works, specifically skip-gram models with negative sampling. The key points are that word2vec is learning word and context vectors such that related words and contexts have similar vectors, and that this is implicitly factorizing the word-context pointwise mutual information matrix. Later sections discuss how hyperparameters are important to word2vec's success and provide critiques of common evaluation tasks like word analogies that don't capture true semantic similarity. The overall message is that word embeddings are fundamentally doing the same thing as older distributional semantic models through matrix factorization.
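The implicit-factorization claim can be made concrete on a toy corpus: the sketch below (illustrative, not from the talk) counts word-context pairs within a window and computes cells of the pointwise mutual information matrix that SGNS implicitly factorizes.

```python
# Toy word-context PMI matrix from a tiny corpus (stdlib only).
import math
from collections import Counter

corpus = "the cat sat on the mat the dog sat on the rug".split()
window = 1

pair_counts, word_counts = Counter(), Counter(corpus)
for i, w in enumerate(corpus):
    for j in range(max(0, i - window), min(len(corpus), i + window + 1)):
        if j != i:
            pair_counts[(w, corpus[j])] += 1

total_pairs = sum(pair_counts.values())
total_words = len(corpus)

def pmi(word, context):
    """PMI(w, c) = log( P(w, c) / (P(w) * P(c)) )."""
    p_wc = pair_counts[(word, context)] / total_pairs
    if p_wc == 0:
        return float("-inf")
    p_w = word_counts[word] / total_words
    p_c = word_counts[context] / total_words
    return math.log(p_wc / (p_w * p_c))

print(pmi("cat", "sat"))  # positive: "cat" and "sat" co-occur more than chance
```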
This document discusses Ask.com's challenge of determining which search queries deserve editorial answers. It presents Ask.com's hybrid approach which first filters out queries that are obviously not suitable for editorial answers. It then uses dedicated classifiers and machine learning to further filter queries, with any low confidence queries sent for human review. This reduces the workload for human reviewers by 97% compared to no filtering. The approach improves the machine learning model's accuracy by focusing its domain and allows it to gradually improve using human ratings as training data. Certain human rater biases are also discussed, showing how pre-filtering data can improve the reliability of human reviews.
Introduction, Terminology and concepts, Introduction to statistics, Central tendencies and distributions, Variance, Distribution properties and arithmetic, Samples/CLT, Basic machine learning algorithms, Linear regression, SVM, Naive Bayes
MLSEV Virtual. Supervised vs Unsupervised (BigML, Inc)
Supervised vs Unsupervised Learning Techniques, by Charles Parker, Vice President of Machine Learning Algorithms at BigML.
MLSEV 2020: Virtual Conference.
How to Determine CLIENT LIFETIME VALUE in Five Minutes (Service Autopilot)
Knowing your Client Lifetime Value will help you:
• Know how much to spend to acquire more clients.
• Know how much to spend to keep existing clients.
• “See” how much your cleaning business is really worth.
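The calculation behind these bullet points can be sketched as a simple recurring-revenue formula (the numbers below are hypothetical, not from Service Autopilot):

```python
# Back-of-envelope Client Lifetime Value under a simple recurring-revenue model.
def client_lifetime_value(avg_ticket, visits_per_year, years_retained, gross_margin):
    """Revenue per visit x visit frequency x retention, scaled by margin."""
    return avg_ticket * visits_per_year * years_retained * gross_margin

# e.g. a $120 cleaning, twice a month, client kept 3 years, at 45% margin
print(client_lifetime_value(120, 24, 3, 0.45))  # 3888.0
```

Once you know a client is worth roughly $3,900, a $200 acquisition cost looks very different than it does against a single $120 ticket.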
The document discusses local search algorithms like hill-climbing for solving optimization problems. It explains that hill-climbing iteratively moves to successor states with improved evaluations until a local optimum is reached. However, hill-climbing often gets stuck in local optima and fails to find global optima. The document proposes methods like allowing sideways moves, random restarts, and stochastic selection to help hill-climbing escape local optima and improve performance.
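The escape strategies above (random restarts in particular) can be sketched in a few lines; the objective function, step size, and restart count here are hypothetical, not taken from the document:

```python
# Hill-climbing with random restarts on a toy 1-D objective.
import random

def objective(x):
    return -(x - 3) ** 2 + 10  # single global maximum at x = 3

def hill_climb(start, step=0.1, max_iters=1000):
    x = start
    for _ in range(max_iters):
        best = max((x - step, x + step), key=objective)
        if objective(best) <= objective(x):
            return x  # no improving neighbor: local optimum reached
        x = best
    return x

random.seed(0)
# Random restarts: run several climbs from random starts, keep the best result.
best = max((hill_climb(random.uniform(-10, 10)) for _ in range(5)), key=objective)
print(best)  # approximately 3
```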
Estimation is associated with Fear, Uncertainty and Death marches. Most of us would rather not estimate. Yet, sometimes we do need estimates and commitments, even on "estimation-less" projects. Play a series of estimation games to experience how different techniques deliver very different results. Learn a few simple rules that turn you into a reliable estimator. But correct estimates aren't enough. See what else is required to deliver on your promises. Learn to deal with the destructive games people play with estimates. Estimating can be Fun: embracing Uncertainty and Delivering.
Why dashboard design should be (but usually never is) based on cognitive scie... (UXPA International)
The document discusses how dashboard design is often not based on principles of cognitive science, which results in dashboards being less effective than they could be. It advocates applying knowledge of human visual perception and quantitative judgment to dashboard design by thinking like a translator to communicate data in a way the human brain can easily understand. The document provides examples of how color, size, and motion influence human perception differently and suggests dashboard designers consider these factors to improve comprehension of data visualizations.
DevOps Enterprise Summit Las Vegas 2018: The Problem of Becoming a 3rd-Line S... (Jon Stevens-Hall)
The document discusses how swarming is a better approach than traditional tiered support structures for DevOps teams. It describes how BMC implemented swarming, including severity 1 swarms for urgent issues and backlog swarms to address long-standing tickets. Swarming improved BMC's key metrics like resolution time and customer satisfaction. The document also notes challenges with swarming and how the approach aligns with DevOps practices like knowledge sharing and preventing burnout.
This document provides an introduction to genetic algorithms. It discusses that genetic algorithms are inspired by Darwinian evolution and use processes like selection, crossover and mutation to evolve solutions to problems. It also provides examples of how genetic algorithms can be used for optimization problems and classification in data mining. The key steps of a genetic algorithm including initializing a population, evaluating fitness, selection, crossover and mutation are outlined.
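The key steps listed (initialize, evaluate fitness, select, cross over, mutate) fit in a short program; the toy "OneMax" objective and the parameter values below are illustrative choices, not from the document:

```python
# Minimal genetic algorithm maximizing the number of 1-bits (OneMax).
import random

random.seed(42)
GENES, POP, GENERATIONS, MUTATION = 20, 30, 60, 0.02

def fitness(ind):
    return sum(ind)  # count of 1-bits

def crossover(a, b):
    cut = random.randrange(1, GENES)   # single-point crossover
    return a[:cut] + b[cut:]

def mutate(ind):
    return [g ^ 1 if random.random() < MUTATION else g for g in ind]

# 1. Initialize a random population.
pop = [[random.randint(0, 1) for _ in range(GENES)] for _ in range(POP)]
for _ in range(GENERATIONS):
    # 2. Evaluate fitness and 3. select: keep the fitter half as parents.
    pop.sort(key=fitness, reverse=True)
    parents = pop[:POP // 2]
    # 4. Crossover and 5. mutation produce the next generation.
    children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                for _ in range(POP - len(parents))]
    pop = parents + children

print(fitness(max(pop, key=fitness)))  # typically 20 (all ones)
```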
This document discusses different types of data and statistical concepts. It begins by describing the major types of data: numerical, categorical, and ordinal. Numerical data represents quantitative measurements, categorical data has no inherent mathematical meaning, and ordinal data has categorical categories with a mathematical order. It then discusses statistical measures like the mean, median, mode, standard deviation, variance, percentiles, moments, covariance, correlation, conditional probability, and Bayes' theorem. Examples are provided to help explain each concept.
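A few of the listed measures, computed with Python's standard library on made-up numbers:

```python
# Central tendency and spread on a toy sample, plus Bayes' theorem.
import statistics

sample = [2, 4, 4, 4, 5, 5, 7, 9]
print(statistics.mean(sample))    # 5.0
print(statistics.median(sample))  # 4.5
print(statistics.mode(sample))    # 4
print(statistics.pstdev(sample))  # 2.0 (population standard deviation)

# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)
def bayes(p_b_given_a, p_a, p_b):
    return p_b_given_a * p_a / p_b

# e.g. a test that is 99% sensitive, for a condition with 1% prevalence,
# when the overall positive rate is 5.9% (hypothetical numbers):
print(round(bayes(0.99, 0.01, 0.059), 3))  # 0.168
```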
The document discusses data-oriented design principles for game engine development in C++. It emphasizes understanding how data is represented and used to solve problems, rather than focusing on writing code. It provides examples of how restructuring code to better utilize data locality and cache lines can significantly improve performance by reducing cache misses. Booleans packed into structures are identified as having extremely low information density, wasting cache space.
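The booleans-have-low-information-density point can be illustrated outside C++ as well: a hypothetical sketch packing several flags into a single integer bitmask instead of storing one boolean per field.

```python
# Pack boolean flags into one integer; each flag costs a single bit.
FLAG_VISIBLE, FLAG_ACTIVE, FLAG_DIRTY = 1 << 0, 1 << 1, 1 << 2

state = 0
state |= FLAG_VISIBLE | FLAG_DIRTY   # set two flags at once
print(bool(state & FLAG_ACTIVE))     # False
print(bool(state & FLAG_DIRTY))      # True
state &= ~FLAG_DIRTY                 # clear one flag
print(bool(state & FLAG_DIRTY))      # False
```

In a cache-line-oriented C++ engine the same idea keeps hot data dense; in Python it is only an analogy for the layout principle the document describes.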
This document provides an introduction to learning how to think like a coder. It discusses reasons for learning to code even if you are not a computer expert, such as that it teaches problem solving skills. It then provides examples of coding scenarios and algorithms to illustrate computational thinking. These include a grocery shopping scenario, math word problems, sorting algorithms, stable marriage algorithms, and traveling salesman problems. It also discusses logic structures used in coding like if/then statements. Finally, it proposes some group activities around writing algorithms for tasks like dances, paper planes, and driverless cars.
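As a concrete instance of the sorting-algorithm examples mentioned, a short selection sort, spelled out step by step (illustrative, not taken from the slides):

```python
# Selection sort: repeatedly move the smallest remaining item into place.
def selection_sort(items):
    items = list(items)  # work on a copy
    for i in range(len(items)):
        smallest = min(range(i, len(items)), key=items.__getitem__)
        items[i], items[smallest] = items[smallest], items[i]
    return items

print(selection_sort([34, 7, 23, 32, 5]))  # [5, 7, 23, 32, 34]
```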
Customer satisfaction for Co.opmart's customer service in Ho Chi Minh (Hỗ Trợ SPSS)
The document is a survey that asks customers of Co.opmart supermarket in Ho Chi Minh City about their satisfaction with the customer service. It collects demographic information and asks customers to rate their agreement with statements about various aspects of Co.opmart's customer service, including interactions with employees, reliability, physical design of service areas, problem solving abilities, customer relationship policies, and overall satisfaction. The survey aims to measure customer satisfaction for Co.opmart's customer service in order to assist a student's MBA thesis.
Graph theory could make a big impact on how we conduct business. Imagine the case where you wish to maximize the reach of a promotion by leveraging your customers' influence to advocate your products and bring their friends on board. The same logic of harnessing one's network can be applied to purchase recommendations, customer behavior analysis, and fraud detection.
Running analyses on large graphs was not trivial for many companies until recently. The field has made significant steps in the last five years, and scalable graph computations are now the norm. You can run graph computations out-of-core (no memory constraints) and in parallel (across multiple machines), especially in Spark, which is spreading like wildfire.
A lot of people are familiar with GraphX, a solid implementation of scalable graphs in Spark. GraphX is interesting, but the project seems to be orphaned. The good news is that there is now an alternative: GraphFrames, a new data structure that takes the best parts of DataFrames and graphs.
In this talk, I will explain how to use GraphFrames from Python in Spark 2.0, with an example using personalized PageRank for recommendations.
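GraphFrames ships personalized PageRank built in; the idea behind it can be sketched in plain Python as a toy power iteration (this is the concept only, not the Spark implementation, and the example graph is made up):

```python
# Toy personalized PageRank: random walk with teleport back to one source node.
def personalized_pagerank(graph, source, alpha=0.15, iters=50):
    nodes = list(graph)
    rank = {n: (1.0 if n == source else 0.0) for n in nodes}
    for _ in range(iters):
        new = {n: (alpha if n == source else 0.0) for n in nodes}
        for n, out in graph.items():
            if out:
                share = (1 - alpha) * rank[n] / len(out)
                for m in out:
                    new[m] += share
            else:
                # dangling node: return its mass to the source
                new[source] += (1 - alpha) * rank[n]
        rank = new
    return rank

graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = personalized_pagerank(graph, "a")
print(max(ranks, key=ranks.get))  # the source dominates its own ranking
```

For recommendations, the source node is a user, and the highest-ranked product nodes become candidates.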
Data Con LA 2022 - Real world consumer segmentation (Data Con LA)
Jaysen Gillespie, Head of Analytics and Data Science at RTB House
1. Shopkick has over 30M downloads, but the userbase is very heterogeneous. Anecdotal evidence indicated a wide variety of users for whom the app holds long-term appeal.
2. Marketing and other teams challenged Analytics to get beyond basic summary statistics and develop a holistic segmentation of the userbase.
3. Shopkick's data science team used SQL and python to gather data, clean data, and then perform a data-driven segmentation using a k-means algorithm.
4. Interpreting the results is more work -- and more fun -- than running the algo itself. We'll discuss how we transform from "segment 1", "segment 2", etc. to something that non-analytics users (Marketing, Operations, etc.) could actually benefit from.
5. So what? How did teams across Shopkick change their approach given what Analytics had discovered?
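The k-means step in point 3 can be shown in miniature; the sketch below runs on hypothetical 1-D "spend" values with naive initialization, not Shopkick's real features or pipeline:

```python
# Toy k-means: assign points to the nearest center, then recompute centers.
def kmeans(points, k, iters=20):
    centers = points[:k]  # naive init; real code would use k-means++ or restarts
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[min(range(k), key=lambda i: abs(p - centers[i]))].append(p)
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return sorted(centers)

spend = [5, 6, 7, 8, 50, 55, 60, 200, 210]
print(kmeans(spend, 3))  # [6.5, 55.0, 205.0]
```

The "more work, more fun" part is exactly what the code does not do: deciding that 6.5 means "browsers", 55 means "regulars", and 205 means "power users".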
Territory Assignment Innovation: High-Velocity Techniques to Maximize Sales with Gusto’s CRO and Head of GTM Ops
Speakers: Tolithia Kornweibel, CRO @ Gusto and Jamie Edwards, Head of Go-to-Market Operations and Tools @ Gusto
1. The document discusses best practices for estimating projects and tasks. It emphasizes using ranges rather than specific numbers for estimates since estimation involves uncertainty.
2. Ten key principles of estimation are outlined, including always asking how the estimate will be used, not negotiating estimates, and using measured past performance to calibrate estimates. Aggregating independent estimates and decomposing work into around 15 tasks can improve accuracy by reducing risk.
3. Two short exercises are presented where participants estimate values and dates. Correct answers are then provided along with commentary on estimation techniques. The document promotes solving problems collaboratively and being transparent about assumptions in estimates.
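The benefit of ranges and of aggregating independent estimates can be demonstrated with a small Monte Carlo simulation; the task ranges and uniform distributions below are hypothetical, not from the document:

```python
# Sum per-task (low, high) day estimates by simulation to get total percentiles.
import random

random.seed(1)
tasks = [(2, 8), (1, 5), (3, 9), (2, 6)]  # hypothetical (low, high) in days

def simulate_total(tasks, trials=10_000):
    totals = sorted(sum(random.uniform(lo, hi) for lo, hi in tasks)
                    for _ in range(trials))
    return totals[len(totals) // 2], totals[int(len(totals) * 0.9)]

p50, p90 = simulate_total(tasks)
print(round(p50, 1), round(p90, 1))  # median vs 90th-percentile total, in days
```

The spread between the 50th and 90th percentile is the honest answer a single point estimate hides: commit to the p90, not the median.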
Introductory presentation to Explainable AI, defending its main motivations and importance. We briefly describe the main techniques available as of March 2020 and share many references to allow the reader to continue their studies.
AI-proof your career by Olivier Vroom and David Williamson (UXPA Boston)
This talk explores the evolving role of AI in UX design and the ongoing debate about whether AI might replace UX professionals. The discussion will explore how AI is shaping workflows, where human skills remain essential, and how designers can adapt. Attendees will gain insights into the ways AI can enhance creativity, streamline processes, and create new challenges for UX professionals.
AI’s influence on UX is growing, from automating research analysis to generating design prototypes. While some believe AI could make most workers (including designers) obsolete, AI can also be seen as an enhancement rather than a replacement. This session, featuring two speakers, will examine both perspectives and provide practical ideas for integrating AI into design workflows, developing AI literacy, and staying adaptable as the field continues to change.
The session will include a relatively long guided Q&A and discussion section, encouraging attendees to philosophize, share reflections, and explore open-ended questions about AI’s long-term impact on the UX profession.
Dark Dynamism: drones, dark factories and deurbanization (Jakub Šimek)
Startup villages are the next frontier on the road to network states. This book aims to serve as a practical guide to bootstrap a desired future that is both definite and optimistic, to quote Peter Thiel’s framework.
Dark Dynamism is my second book, a kind of sequel to Bespoke Balajisms, which I published on Kindle in 2024. The first book covered about 90 ideas of Balaji Srinivasan and 10 of my own concepts that I built on top of his thinking.
In Dark Dynamism, I focus on ideas of my own that I have played with over the last 8 years, inspired by Balaji Srinivasan, Alexander Bard and many people from the Game B and IDW scenes.
Slides of Limecraft Webinar on May 8th 2025, where Jonna Kokko and Maarten Verwaest discuss the latest release.
This release includes major enhancements and improvements of the Delivery Workspace, as well as provisions against unintended exposure of Graphic Content, and rolls out the third iteration of dashboards.
Customer cases include Scripted Entertainment (continuing drama) for Warner Bros, as well as AI integration in Avid for ITV Studios Daytime.
Title: Securing Agentic AI: Infrastructure Strategies for the Brains Behind the Bots
As AI systems evolve toward greater autonomy, the emergence of Agentic AI—AI that can reason, plan, recall, and interact with external tools—presents both transformative potential and critical security risks.
This presentation explores:
> What Agentic AI is and how it operates (perceives → reasons → acts)
> Real-world enterprise use cases: enterprise co-pilots, DevOps automation, multi-agent orchestration, and decision-making support
> Key risks based on the OWASP Agentic AI Threat Model, including memory poisoning, tool misuse, privilege compromise, cascading hallucinations, and rogue agents
> Infrastructure challenges unique to Agentic AI: unbounded tool access, AI identity spoofing, untraceable decision logic, persistent memory surfaces, and human-in-the-loop fatigue
> Reference architectures for single-agent and multi-agent systems
> Mitigation strategies aligned with the OWASP Agentic AI Security Playbooks, covering: reasoning traceability, memory protection, secure tool execution, RBAC, HITL protection, and multi-agent trust enforcement
> Future-proofing infrastructure with observability, agent isolation, Zero Trust, and agent-specific threat modeling in the SDLC
> Call to action: enforce memory hygiene, integrate red teaming, apply Zero Trust principles, and proactively govern AI behavior
Presented at the Indonesia Cloud & Datacenter Convention (IDCDC) 2025, this session offers actionable guidance for building secure and trustworthy infrastructure to support the next generation of autonomous, tool-using AI agents.
Longitudinal Benchmark: A Real-World UX Case Study in Onboarding by Linda Bor... (UXPA Boston)
This is a case study of a three-part longitudinal research study with 100 prospects to understand their onboarding experiences. In part one, we performed a heuristic evaluation of the websites and the getting started experiences of our product and six competitors. In part two, prospective customers evaluated the website of our product and one other competitor (best performer from part one), chose one product they were most interested in trying, and explained why. After selecting the one they were most interested in, we asked them to create an account to understand their first impressions. In part three, we invited the same prospective customers back a week later for a follow-up session with their chosen product. They performed a series of tasks while sharing feedback throughout the process. We collected both quantitative and qualitative data to make actionable recommendations for marketing, product development, and engineering, highlighting the value of user-centered research in driving product and service improvements.
In-App Guidance: Save Enterprises Millions in Training & IT Costs (aptyai)
Discover how in-app guidance empowers employees, streamlines onboarding, and reduces IT support needs, helping enterprises save millions on training and support costs while boosting productivity.
Build with AI events are community-led, hands-on activities hosted by Google Developer Groups and Google Developer Groups on Campus across the world from February 1 to July 31, 2025. These events aim to help developers acquire and apply Generative AI skills to build and integrate applications using the latest Google AI technologies, including AI Studio, the Gemini and Gemma family of models, and Vertex AI. This particular event series includes Thematic Hands-on Workshops: guided learning on specific AI tools or topics, as well as a prequel to the Hackathon to foster innovation using Google AI tools.
Ivanti’s Patch Tuesday breakdown goes beyond patching your applications and brings you the intelligence and guidance needed to prioritize where to focus your attention first. Catch early analysis on our Ivanti blog, then join industry expert Chris Goettl for the Patch Tuesday Webinar Event. There we’ll do a deep dive into each of the bulletins and give guidance on the risks associated with the newly-identified vulnerabilities.
OpenAI Just Announced Codex: A cloud engineering agent that excels in handlin... (SOFTTECHHUB)
The world of software development is constantly evolving. New languages, frameworks, and tools appear at a rapid pace, all aiming to help engineers build better software, faster. But what if there was a tool that could act as a true partner in the coding process, understanding your goals and helping you achieve them more efficiently? OpenAI has introduced something that aims to do just that.
Crazy Incentives and How They Kill Security. How Do You Turn the Wheel? (Christian Folini)
Everybody is driven by incentives. Good incentives persuade us to do the right thing and patch our servers. Bad incentives make us eat unhealthy food and follow stupid security practices.
There is a huge resource problem in IT, especially in the IT security industry. Therefore, you would expect people to pay attention to the existing incentives and the ones they create with their budget allocation, their awareness training, their security reports, etc.
But reality paints a different picture: bad incentives all around! We see insane security practices eating up valuable time and online training that annoys corporate users.
But it's even worse. I've come across incentives that lure companies into creating bad products, and I've seen companies create products that incentivize their customers to waste their time.
It takes people like you and me to say "NO" and stand up for real security!
Refactoring meta-rauc-community: Cleaner Code, Better Maintenance, More Machines (Leon Anavi)
RAUC is a widely used open-source solution for robust and secure software updates on embedded Linux devices. In 2020, the Yocto/OpenEmbedded layer meta-rauc-community was created to provide demo RAUC integrations for a variety of popular development boards. The goal was to support the embedded Linux community by offering practical, working examples of RAUC in action - helping developers get started quickly.
Since its inception, the layer has tracked and supported the Long Term Support (LTS) releases of the Yocto Project, including Dunfell (April 2020), Kirkstone (April 2022), and Scarthgap (April 2024), alongside active development in the main branch. Structured as a collection of layers tailored to different machine configurations, meta-rauc-community has delivered demo integrations for a wide variety of boards, utilizing their respective BSP layers. These include widely used platforms such as the Raspberry Pi, NXP i.MX6 and i.MX8, Rockchip, Allwinner, STM32MP, and NVIDIA Tegra.
Five years into the project, a significant refactoring effort was launched to address increasing duplication and divergence in the layer’s codebase. The new direction involves consolidating shared logic into a dedicated meta-rauc-community base layer, which will serve as the foundation for all supported machines. This centralization reduces redundancy, simplifies maintenance, and ensures a more sustainable development process.
The ongoing work, currently taking place in the main branch, targets readiness for the upcoming Yocto Project release codenamed Wrynose (expected in 2026). Beyond reducing technical debt, the refactoring will introduce unified testing procedures and streamlined porting guidelines. These enhancements are designed to improve overall consistency across supported hardware platforms and make it easier for contributors and users to extend RAUC support to new machines.
The community's input is highly valued: What best practices should be promoted? What features or improvements would you like to see in meta-rauc-community in the long term? Let’s start a discussion on how this layer can become even more helpful, maintainable, and future-ready - together.
How Top Companies Benefit from OutsourcingNascenture
Explore how leading companies leverage outsourcing to streamline operations, cut costs, and stay ahead in innovation. By tapping into specialized talent and focusing on core strengths, top brands achieve scalability, efficiency, and faster product delivery through strategic outsourcing partnerships.
🔍 Top 5 Qualities to Look for in Salesforce Partners in 2025
Choosing the right Salesforce partner is critical to ensuring a successful CRM transformation in 2025.
Harmonizing Multi-Agent Intelligence | Open Data Science Conference | Gary Ar...Gary Arora
This deck from my talk at the Open Data Science Conference explores how multi-agent AI systems can be used to solve practical, everyday problems — and how those same patterns scale to enterprise-grade workflows.
I cover the evolution of AI agents, when (and when not) to use multi-agent architectures, and how to design, orchestrate, and operationalize agentic systems for real impact. The presentation includes two live demos: one that books flights by checking my calendar, and another showcasing a tiny local visual language model for efficient multimodal tasks.
Key themes include:
✅ When to use single-agent vs. multi-agent setups
✅ How to define agent roles, memory, and coordination
✅ Using small/local models for performance and cost control
✅ Building scalable, reusable agent architectures
✅ Why personal use cases are the best way to learn before deploying to the enterprise
2. CS 561 2
How do you find a solution in a large complex space?
• Ask an expert?
• Adapt existing designs?
• Trial and error?
3. Example: Traveling Salesperson (TSP)
• Classic example: you have N cities; find the shortest route such that your salesperson visits each city once and returns.
• This problem is known to be NP-hard.
• As a new city is added to the problem, computation time of the classic exact solution increases exponentially, O(2^n) … (as far as we know)
[Map: a Texas salesperson's route through Dallas, Houston, San Antonio, Austin, and Mos Eisley. Is this the shortest path?]
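To see why exhaustive search blows up, here is a tiny brute-force sketch. The city coordinates are invented for illustration; only the growth rate of the search space matters.

```python
# Brute-force TSP over all permutations -- feasible only for tiny n.
from itertools import permutations
from math import dist, factorial

# Made-up planar coordinates for the cities from the slides.
cities = {
    "Dallas": (0.0, 5.0),
    "Houston": (4.0, 0.0),
    "San Antonio": (0.0, 0.0),
    "Austin": (1.0, 2.0),
    "Mos Eisley": (9.0, 9.0),
}

def tour_length(order):
    """Total distance of the closed tour visiting cities in `order`."""
    return sum(dist(cities[a], cities[b])
               for a, b in zip(order, order[1:] + order[:1]))

names = list(cities)
best = min(permutations(names), key=tour_length)
print(best, round(tour_length(best), 2))

# (n-1)! distinct tours after fixing the start city -- this explodes fast:
for n in (5, 10, 15):
    print(n, "cities ->", factorial(n - 1), "tours")
```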
4. What if………
• Let's create a whole bunch of random salespeople, see how well they do, and pick the best one(s).
• Salesperson A
• Houston -> Dallas -> Austin -> San Antonio -> Mos Eisley
• Distance traveled: 780 km
• Salesperson B
• Houston -> Mos Eisley -> Austin -> San Antonio -> Dallas
• Distance traveled: 820 km
• Salesperson A is better (more fit) than salesperson B.
• Perhaps we would like salespeople to be more like A and less like B.
• Question: do we want to just keep picking random salespeople like this and keep testing them?
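The "pick random salespeople and keep the best" idea is pure random search; a minimal sketch, with invented coordinates (this is the baseline a GA will improve on):

```python
import random
from math import dist

# Made-up coordinates, matching the cities named in the slides.
cities = {
    "Dallas": (0.0, 5.0), "Houston": (4.0, 0.0),
    "San Antonio": (0.0, 0.0), "Austin": (1.0, 2.0), "Mos Eisley": (9.0, 9.0),
}

def tour_length(order):
    """Length of the closed tour through `order`."""
    return sum(dist(cities[a], cities[b])
               for a, b in zip(order, order[1:] + order[:1]))

rng = random.Random(42)
names = list(cities)

# Generate 1000 random salespeople and keep the fittest (shortest tour).
best = min((rng.sample(names, len(names)) for _ in range(1000)),
           key=tour_length)
```

Random search wastes most of its samples; the rest of the lecture is about steering the sampling instead.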
5. We can get a little closer to the solution in polynomial time
• We might use one or more heuristics to guide us in creating new salespeople.
• For instance, we might use the triangle inequality to help pick better potential salespeople.
• One can create an initial 2-approximation (at worst, the distance is twice the optimal) to metric TSP with an efficient polynomial-time method, or simply seed with Nearest Neighbor tours.
• This detail is somewhat unimportant: you can use all kinds of heuristics to help you create a better initial set of salespeople [e.g. Match Twice and Stitch (Kahng & Reda, 2004)].
• Use some sort of incremental improvement to make them successively better.
• The idea is that you start with result(s) closer to where you think the solution is than you would obtain at random, so that the search converges more quickly.
• Be careful: an initial approximation may be too close to a local extremum, which might actually slow down convergence or throw the solution off.
6. However…………
• Salesperson A is better than salesperson B, but we can imagine that it would be easy to create a salesperson C who is even better.
• We don't want to create 2^n salespeople!
• This is a lecture about genetic algorithms (GA), <sarcasm> what
kind of solution will we use?</sarcasm>
• Should we try a genetic algorithm solution???
• Really? Are you sure? Maybe we should try something else
• It might be that you would prefer another solution
• I mean it might not be a bad idea
- You might learn something new
- However it might not be all that exciting
- I’m kind of not sure
- My mother suggested that I should do something else
- But at any rate I suppose you would like to get on with it
- Ok, if you insist, but it's all in your hands!
[Randomly inserted image, for no reason at all]
7. Represent the problem like a DNA sequence
Each DNA sequence is a possible solution to the problem. The order of the cities in the genes is the order in which the salesperson visits them.

DNA - Salesperson A: San Antonio -> Dallas -> Mos Eisley -> Houston -> Austin
DNA - Salesperson B: Dallas -> Houston -> Mos Eisley -> San Antonio -> Austin
8. Ranking by Fitness
Here we’ve created three different salespeople, then checked how far each one has to travel. This gives us a measure of “fitness”: the one that travels the shortest distance is the fittest.

Note: we need to be able to measure fitness in polynomial time, otherwise we are in trouble.
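A fitness check of this kind can be sketched in a few lines; the coordinates are invented for illustration, and evaluating each individual is O(n), so ranking the whole population stays polynomial:

```python
from math import dist

# Made-up coordinates for the cities from the slides.
cities = {
    "Dallas": (0.0, 5.0), "Houston": (4.0, 0.0),
    "San Antonio": (0.0, 0.0), "Austin": (1.0, 2.0), "Mos Eisley": (9.0, 9.0),
}

def tour_length(order):
    # O(n) per individual: one pass around the closed tour.
    return sum(dist(cities[a], cities[b])
               for a, b in zip(order, order[1:] + order[:1]))

population = [
    ["Houston", "Dallas", "Austin", "San Antonio", "Mos Eisley"],
    ["Houston", "Mos Eisley", "Austin", "San Antonio", "Dallas"],
    ["Dallas", "Houston", "Mos Eisley", "San Antonio", "Austin"],
]

# Shorter tour = higher fitness, so sort ascending by length.
ranked = sorted(population, key=tour_length)
```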
9. Let’s breed them!
• We have a population of traveling sales people. We also know their
fitness based on how long their trip is. We want to create more, but
we don’t want to create too many.
• We take the notion that the salespeople who perform better are closer to the optimal salesperson than the ones that performed more poorly. Could the optimal salesperson be a “combination” of the better salespeople?
• We create a population of sales people as solutions to the problem.
• How do we actually mate a population of data???
10. Crossover
Exchanging information through some part of the representation.
Once we have found the best salespeople, we will in a sense mate them. We can do this in several ways. Better salespeople should mate more often and poor salespeople should mate less often.
Sales People              City DNA
Parent 1 (Salesperson A)  F A B | E C G D
Parent 2 (Salesperson B)  D E A | C G B F
Child 1  (Salesperson C)  F A B | C G B F
Child 2  (Salesperson D)  D E A | E C G D
11. Crossover Bounds (Houston, we have a problem)
• Not all crossed pairs are viable. We can only visit a city once.
• Different GA problems may have different bounds.
Parents:
San Antonio -> Dallas -> Mos Eisley -> Houston -> Austin
Dallas -> Houston -> Austin -> San Antonio -> Mos Eisley

Children:
Dallas -> Houston -> Mos Eisley -> Houston -> Austin (visits Houston twice: not viable!)
San Antonio -> Dallas -> Austin -> San Antonio -> Mos Eisley (visits San Antonio twice: not viable!)
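The viability problem is easy to reproduce with a naive one-point crossover. This sketch uses hypothetical helper names (`one_point_crossover`, `viable`) and the parents from the slide:

```python
# Naive one-point crossover on tours: cut both parents at the same point
# and swap tails. For permutations this usually duplicates cities.
def one_point_crossover(p1, p2, cut):
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

def viable(tour):
    """A tour is viable only if it visits each city exactly once."""
    return len(set(tour)) == len(tour)

parent1 = ["San Antonio", "Dallas", "Mos Eisley", "Houston", "Austin"]
parent2 = ["Dallas", "Houston", "Austin", "San Antonio", "Mos Eisley"]

child1, child2 = one_point_crossover(parent1, parent2, cut=2)
print(child1, viable(child1))  # duplicate cities appear -> not viable
print(child2, viable(child2))
```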
12. TSP needs some special rules for crossover
• Many GA problems also need special crossover rules.
• Since each genetic sequence contains all the cities in the travel,
crossover is a swapping of travel order.
• Remember that crossover also needs to be efficient.
Parents:
San Antonio -> Dallas -> Mos Eisley -> Houston -> Austin
Dallas -> Mos Eisley -> Houston -> San Antonio -> Austin

Children (viable):
San Antonio -> Dallas -> Houston -> Austin -> Mos Eisley
Dallas -> Houston -> Austin -> San Antonio -> Mos Eisley
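One standard repair-free operator for permutations is order crossover (OX). This is my own illustrative sketch, not necessarily the exact rule the slide depicts, but it shows the idea: every child is guaranteed to be a valid tour, and it runs in polynomial time.

```python
# Order crossover (OX): copy a slice from parent 1, then fill the
# remaining positions with the missing cities in parent 2's order.
def order_crossover(p1, p2, start, stop):
    child = [None] * len(p1)
    child[start:stop] = p1[start:stop]           # inherit a slice from parent 1
    fill = [c for c in p2 if c not in child]     # remaining cities, p2's order
    for i in range(len(child)):
        if child[i] is None:
            child[i] = fill.pop(0)
    return child

parent1 = ["San Antonio", "Dallas", "Mos Eisley", "Houston", "Austin"]
parent2 = ["Dallas", "Mos Eisley", "Houston", "San Antonio", "Austin"]

child = order_crossover(parent1, parent2, 1, 3)
assert sorted(child) == sorted(parent1)          # always a valid tour
```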
13. What about local extrema?
• With just crossover breeding, we are constrained to gene sequences that are crosses of our current population.
• Introduce random effects into our population.
• Mutation – Randomly twiddle the genes with some probability.
• Cataclysm – Kill off n% of your population and create fresh new
salespeople if it looks like you are reaching a local minimum.
• Annealing of Mating Pairs – Accept the mating of suboptimal pairs with
some probability.
• Etc…
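Two of the randomization tricks above can be sketched directly; the function names, rates, and fractions here are illustrative choices, not fixed parts of the technique:

```python
import random

def mutate(tour, rate=0.1, rng=random):
    """Swap mutation: with probability `rate`, exchange two cities."""
    tour = tour[:]
    if rng.random() < rate:
        i, j = rng.sample(range(len(tour)), 2)
        tour[i], tour[j] = tour[j], tour[i]
    return tour

def cataclysm(population, cities, fraction, key, rng=random):
    """Kill off the worst `fraction` of the population and refill it
    with fresh random tours (used when stuck near a local minimum)."""
    keep = int(len(population) * (1 - fraction))
    survivors = sorted(population, key=key)[:keep]
    while len(survivors) < len(population):
        survivors.append(rng.sample(cities, len(cities)))
    return survivors
```

Note that swap mutation preserves viability: the result is still a permutation of the same cities.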
14. In summation: The GA Cycle
Fitness -> Selection -> Crossover -> Mutation -> New Population -> (back to Fitness)
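The whole cycle can be sketched end to end. The operator choices here (truncation selection, order crossover, swap mutation) and all parameters are illustrative, not the only options:

```python
import random
from math import dist

# Made-up coordinates for the cities from the slides.
CITIES = {
    "Dallas": (0.0, 5.0), "Houston": (4.0, 0.0),
    "San Antonio": (0.0, 0.0), "Austin": (1.0, 2.0), "Mos Eisley": (9.0, 9.0),
}

def tour_length(order):
    return sum(dist(CITIES[a], CITIES[b])
               for a, b in zip(order, order[1:] + order[:1]))

def order_crossover(p1, p2, rng):
    """Copy a random slice from p1, fill the rest in p2's order."""
    start, stop = sorted(rng.sample(range(len(p1)), 2))
    child = [None] * len(p1)
    child[start:stop] = p1[start:stop]
    fill = [c for c in p2 if c not in child]
    return [fill.pop(0) if g is None else g for g in child]

def evolve(generations=50, pop_size=30, mutation_rate=0.2, seed=0):
    rng = random.Random(seed)
    names = list(CITIES)
    population = [rng.sample(names, len(names)) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=tour_length)        # fitness
        parents = population[: pop_size // 2]   # (truncation) selection
        children = []
        while len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            child = order_crossover(p1, p2, rng)    # crossover
            if rng.random() < mutation_rate:        # mutation
                i, j = rng.sample(range(len(child)), 2)
                child[i], child[j] = child[j], child[i]
            children.append(child)
        population = children                   # new population
    return min(population, key=tour_length)

best = evolve()
```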
15. GA and TSP: the claims
• Can solve for over 3,500 cities (though it still took over 1 CPU-year).
• Maybe holds the record.
• Will get within 2% of the optimal solution.
• This means it is not an exact solution per se, but an approximation.
16. GA Discussion
• We can apply the GA approach to any problem where we can represent the problem's solution (even very abstractly) as a string.
• We can create strings of:
• Digits
• Labels
• Pointers
• Code Blocks – This creates new programs from blocks of code strung together. The key is to make sure the code can run.
• Whole Programs – Modules or complete programs can be strung
together in a series. We can also re-arrange the linkages between
programs.
• The last two are examples of Genetic Programming
17. Things to consider
• How large is your population?
• A large population will take more time to run (you have to test each
member for fitness!).
• A large population will cover more bases at once.
• How do you select your initial population?
• You might create a population of approximate solutions. However, some
approximations might start you in the wrong position with too much
bias.
• How will you crossbreed your population?
• You want to crossbreed and select for your best specimens.
• Too strict: You will tend towards local minima
• Too lax: Your problem will converge slower
• How will you mutate your population?
• Too little: your problem will tend to get stuck in local minima
• Too much: your population will fill with noise and not settle.
18. GA is a good "no clue" approach to problem solving
• GA is superb if:
• Your space is loaded with lots of weird bumps and local minima.
• GA tends to spread out and test a larger subset of your space than many
other types of learning/optimization algorithms.
• You don’t quite understand the underlying process of your problem
space.
• NO I DON'T: What makes the stock market work? Don't know? Me neither! Stock market prediction might thus be good for a GA.
• YES I DO: Want to make a program to predict people's height from personality factors? This might be a Gaussian process and a good candidate for statistical methods, which are more efficient.
• You have lots of processors
• GA’s parallelize very easily!
19. Why not use GA?
• Creating generations of samples and crossbreeding them can be resource intensive.
• Some problems may be better solved by a general gradient descent method that uses fewer resources.
• However, resource-wise, a GA is still quite efficient (no computation of derivatives, etc.).
• In general if you know the mathematics, shape or underlying
process of your problem space, there may be a better solution
designed for your specific need.
• Consider Kernel Based Learning and Support Vector Machines?
• Consider Neural Networks?
• Consider Traditional Polynomial Time Algorithms?
• Etc.
Editor's Notes
#2: What has been traditionally done in the heat exchanger world is to take a best “guess” at a design with the help of an expert. Traditional heat exchanger designs are usually fully mathematically described.
This initial guess can then be used as a basis, and various parameters are “tweaked”. The performance of the heat exchanger can be recalculated to see if the modified design is an improvement on the original one.
Surely there must be a better way? There is - we don’t need to look very far to find examples of optimisation in Nature.
#7: Once the fitness has been assigned, pairs of chromosomes representing heat exchanger designs can be chosen for mating.
The higher the fitness, the greater the probability of the design being selected. Consequently, some of the weaker population members do not mate at all, whilst superior ones are chosen many times.
It is even statistically possible for a member to be chosen to mate with itself. This has no advantage, as the offspring will be identical to the parent.
#8: Once the population has been formed (either randomly in the initial generation, or by mating in subsequent generations), each population member needs to be assessed against the desired properties - such a rating is called a “fitness”.
The design parameters represented by the zeros and ones in the binary code of each chromosome are fed into the mathematical model describing the heat exchanger. The output parameters for each design are used to give the fitness rating. A good design has a high fitness value, and a poor design a lower value.
#10: The mating process is analogous to crossover carried out in living cells.
A pair of binary strings are used. A site along the length of the string is chosen randomly. In this example it is shown between the 6th and 7th bits, but it could be anywhere.
Both members of the pair are severed at that site, and their latter portions are exchanged. Two parents form two children, and these two “daughter” designs become members of the population for the next generation.
This process takes place for each pair selected, so the new population has the same number of members as the previous generation.
#14: Summary of the previous steps to the model.
Populations are continuously produced, going round the outer loop of this diagram, until the desired amount of optimisation has been achieved.