Designing Data-Intensive Applications by Martin Kleppmann was not a quick read. To be clear, it is not such a long book (the paper version is 400 pages), but it is so dense with information that it takes some time to go through. The book covers a broad spectrum of data …
Models of Data Science teams: Chess vs Checkers
How many data engineers should we hire? Are they too many compared to our data scientists?
One of the key decisions to make when building a data science team is the mix of roles. This means choosing the right mix of backgrounds and of activities that each member of the …
Choosing my next job title (in a data science career)
I'm now part of a data and AI team in a fintech spinoff. When I joined the company, it did not make sense to spend time defining precise job titles because we had to build everything from scratch (software, teams, and organization). My job title was therefore a …
What we expected from Covid on March 10th
The first Covid case in Italy was found on February 21st 2020. A couple of weeks later we were entering the lockdown with this number of new daily cases.

The number of Covid-19 new cases was growing really fast every day. We had no clue about what was going to …
Summary: Building AI Solutions with Azure ML
While studying for the Azure Data Scientist Associate certification, I took notes from the Building AI Solutions with Azure ML course. In this single page, you'll find the entire content of the course (as of 18th August, 2020). This page is a small support for those preparing to earn the certification …
Error when restarting Databricks streaming job
This is an error I encountered while running a Spark Streaming job on Databricks 6.1. Consider the case where I have to update a running streaming query. Databricks recommends always starting (and restarting too?) a streaming query on a new dedicated cluster. However, in some scenarios you …
New Work on atacmonitor.com

My side project atacmonitor has a new look. Data is now being collected for all bus and tram lines in Rome. The data is pulled by Python functions running on AWS Lambda and then stored in MongoDB, hosted on MongoDB Atlas. Atlas also provides the charts in the page …
The Pragmatic Programmer [Highlights]
Rather than construction, software is more like gardening: it is more organic than concrete. You plant many things in a garden according to an initial plan and conditions. Some thrive, others are destined to end up as compost. [...] You constantly monitor the health of the garden, and make adjustments (to …
6 Take-Aways after Reading "The Signal and The Noise"
The Signal and The Noise by Nate Silver is a must-read book for those interested in predictions. It is not a technical book. You will not learn any algorithm. However, it presents a series of real-world scenarios where predictions worked and where they did not. The book is …
My Talk about Superset [Python Milano Meetup]
Yesterday, I gave a talk at the Python Milano Meetup. The Meetup was designed as Python pills: three 20-minute talks in a row. The talks:
- Superset: data visualization at AirBnB - Marco Santoni
- Java Vs Python - Cesare Placanica
- pdb in action - Lorenzo Mele
Very nice talk of @Airbnb #Superset with @MrSantoni at #PythonMilano …