New Open Source Projects, SQL Interview Tips, Math and Statistics Course with Julia

Happy Tuesday, Friends!

First, I wanted to thank you all for subscribing to this newsletter!

I started this newsletter four weeks ago as an experiment aiming to consolidate the regular posts I share on LinkedIn and other social media into a weekly newsletter. Four weeks into this project, based on the feedback I received and my own learning, I realized that having too many topics in one place defeats the purpose of a concise weekly summary (and is challenging to maintain on a weekly basis 😅). Therefore, I decided to remove the MLOps and forecasting sections from the newsletter and focus on core data science topics (open source, learning resources, books, tools, etc.). I feel each of those topics (MLOps and forecasting) deserves its own love ❤️ and attention 😎, so I plan to create a separate monthly newsletter for time series forecasting and another one for MLOps topics. Please stay tuned!

As always, feedback is welcome! 🙏🏼

⭐️ Daily updates on 👉🏼 Instagram, Threads, and Facebook ⭐️


Open Source Weekly

Here are two new open-source projects I came across this week.

DoubleML Coverage

DoubleML is an open-source project for causal machine learning. It implements the double/debiased machine learning framework for treatment and structural parameters from the paper by Chernozhukov et al. (2018). The project has both Python and R APIs.

Conditional VaR Plot; Image credit: library documentation

DoubleML Coverage is a new open-source project by the DoubleML team that provides coverage simulations for the DoubleML library. Additional information is available in the release post by Philipp Bach.
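
If the term "coverage simulation" is new to you, here is a minimal sketch of the general idea in plain Python. It does not use the DoubleML API or reproduce the project's setup; the function name, the normal data-generating process, and all parameter values are assumptions for illustration only. The idea is to repeatedly simulate data with a known true parameter, build a confidence interval each time, and check how often the interval contains the truth.

```python
import random
import statistics


def coverage_rate(true_mean=1.0, sigma=2.0, n_obs=200, n_rep=1000, z=1.96, seed=42):
    """Share of repetitions whose nominal 95% interval contains the true mean.

    Hypothetical helper for illustration; not part of DoubleML.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_rep):
        # Simulate one dataset from a known data-generating process
        sample = [rng.gauss(true_mean, sigma) for _ in range(n_obs)]
        estimate = statistics.fmean(sample)
        std_error = statistics.stdev(sample) / n_obs ** 0.5
        # Normal-approximation confidence interval around the estimate
        lower, upper = estimate - z * std_error, estimate + z * std_error
        if lower <= true_mean <= upper:
            hits += 1
    return hits / n_rep


rate = coverage_rate()
print(f"Empirical coverage of the nominal 95% interval: {rate:.3f}")
```

An empirical rate close to the nominal level (here 95%) suggests the intervals are well calibrated for that setup.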

RandomWalker

The second project is RandomWalker, an R library by Steven Sanderson and Antti Rask. As the name implies, the library provides tools for creating random walks using different methods.

Example of random walks simulation; image credit: library documentation

The library follows the tidyverse workflow, and version 0.1.0 is available on CRAN.
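
For readers who want a feel for what a random walk simulation looks like, here is a rough sketch in Python with NumPy. This is not the RandomWalker API (the library is written in R); the function name and the choice of normally distributed steps are my own assumptions. Each walk is simply the cumulative sum of random steps, starting at zero.

```python
import numpy as np


def simulate_random_walks(n_walks=5, n_steps=100, seed=123):
    """Simulate Gaussian random walks; one row per walk, starting at 0.

    Hypothetical helper for illustration; not part of RandomWalker.
    """
    rng = np.random.default_rng(seed)
    # Independent standard normal steps for every walk
    steps = rng.normal(loc=0.0, scale=1.0, size=(n_walks, n_steps))
    # A walk is the running total of its steps
    paths = np.cumsum(steps, axis=1)
    # Prepend the starting position (0) to each walk
    return np.hstack([np.zeros((n_walks, 1)), paths])


walks = simulate_random_walks()
print(walks.shape)  # (5, 101)
```

Swapping the step distribution (for example, discrete steps of -1 and +1) gives a different flavor of random walk with the same cumulative-sum structure.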


New Learning Resources

Here are some new learning resources that I came across this week.

Python in Excel

A new course by Helen Wall focuses on the integration between Excel and Python. The course provides tools for using and embedding Python applications in Excel. This includes calling Python functions and libraries from Excel and applying algorithms and calculations to Excel data with Python on the backend. The course is part of a learning path that focuses on Python applications with Excel.

For more details, please check Helen's release post.

GenAI Agents Tutorials

The GenAI Agents repository by Nir Diamant is a new project that provides a list of curated tutorials for the development and implementation of GenAI agents.

The tutorials range from beginner-level topics, such as setting up a context-aware conversational agent and a data analyst agent, to advanced topics, such as a multi-agent collaboration system 🎯.

Advanced NumPy Course - Vectorization, Masking, Broadcasting & More

This one-hour tutorial by NeuralNine focuses on NumPy's advanced functionality. This includes the following topics:

  • Broadcasting
  • Vectorization 
  • Masking 
  • Advanced indexing
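
To give a quick taste of the four topics above, here is a small sketch of my own (not taken from the tutorial). The `prices` array and its values are made up for illustration.

```python
import numpy as np

# Made-up 3x3 array: rows are days, columns are assets
prices = np.array([[10.0, 20.0, 30.0],
                   [12.0, 18.0, 33.0],
                   [11.0, 25.0, 27.0]])

# Vectorization: apply a function to every element without a Python loop
log_prices = np.log(prices)

# Broadcasting: subtract the (3,) vector of column means from the (3, 3) array
centered = prices - prices.mean(axis=0)

# Masking: keep only the elements that satisfy a condition
above_20 = prices[prices > 20.0]

# Advanced indexing: pick elements (0, 1) and (2, 0) using index arrays
rows = np.array([0, 2])
cols = np.array([1, 0])
picked = prices[rows, cols]

print(above_20)  # [30. 33. 25. 27.]
print(picked)    # [20. 11.]
```

Note that the mask returns a flat array in row-major order, and the index arrays are paired element by element rather than forming a grid.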

SQL Mistakes

Are you preparing for SQL interviews? Here is a great summary by Vaishali Macwan of common SQL mistakes when using CTEs, joins and window functions.
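
As an example of the kind of mistake that comes up with joins, here is a sketch of my own using Python's built-in sqlite3 module. It is not taken from Vaishali's summary, and the tables, columns, and values are made up. Joining to a table with multiple rows per key before aggregating silently duplicates rows; collapsing that table in a CTE first avoids it.

```python
import sqlite3

# In-memory database with made-up tables: order 1 has two shipments
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, customer TEXT, amount REAL);
    CREATE TABLE shipments (shipment_id INTEGER, order_id INTEGER);
    INSERT INTO orders VALUES (1, 'ana', 100.0), (2, 'ben', 50.0);
    INSERT INTO shipments VALUES (10, 1), (11, 1), (12, 2);
""")

# Mistake: the join repeats order 1 once per shipment, inflating the total
inflated = conn.execute("""
    SELECT SUM(o.amount)
    FROM orders AS o
    JOIN shipments AS s ON s.order_id = o.order_id
""").fetchone()[0]

# Fix: reduce shipments to one row per order in a CTE, then join
correct = conn.execute("""
    WITH shipment_counts AS (
        SELECT order_id, COUNT(*) AS n_shipments
        FROM shipments
        GROUP BY order_id
    )
    SELECT SUM(o.amount)
    FROM orders AS o
    JOIN shipment_counts AS c ON c.order_id = o.order_id
""").fetchone()[0]

print(inflated, correct)  # 250.0 150.0
conn.close()
```

A quick sanity check in an interview setting is to compare row counts before and after each join.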

MATH2504, Programming for Mathematicians with Julia

If you are looking for resources to learn core math and statistics with Julia, the University of Queensland's MATH2504 course is a great place to start. The course provides an introduction to programming and software architecture for mathematics, including statistical and data analysis applications with Julia. The full course is now available on YouTube and is taught by Dr. Paul Bellette, Dr. Claire Foster, and guest lecturers. Big shout out to Prof. Yoni Nazarathy for sharing great resources for learning statistics and math with Julia!

Book of the Week

This weekend, I had the chance to dive into Jonathan Regenstein's new book, co-authored with Prof. Sudheer Chava and Prof. Emmanuel Alanis: A Practical Guide to Macroeconomic Data with Python. The book also has an R version, and as the name implies, it focuses on analyzing macroeconomic data using R and Python 🎯.

Violin plot of the percentage change in GDP components, created with Python

The book covers the following topics:

  • Working with financial and macroeconomic data sources such as FRED and Yahoo Finance APIs
  • Analysis techniques for macro data like GDP, inflation, interest rates, employment, and market data
  • Data visualization techniques

I find that learning while coding is more intuitive. Similar to Jonathan's previous book, Reproducible Finance with R, the new book is one that I wish I had while taking my micro and finance courses during my Master's program.

The book's online version is available for purchase on the book's website, and a hard copy is expected to be released in a few months.

More details are available on the book website:


Meme of the Week

The meme of the week goes to Dylan Anderson for the following post:

Most likely based on many people's true stories 🤣.


See you next Tuesday!

Thanks,

Rami

⭐️Join my Data Science Channel for daily updates⭐️

