What does "practical" mean to you? At TDS, we try to keep the definition loose. The articles we share with you every day have to offer a distinct value: something you can make use of in your day-to-day lives. But a high-level theoretical explanation or a personal reflection about work and identity can be just as useful – just as practical, even! – as a well-executed tutorial. This week’s Variable is a case in point.
We start with a post from Ryan Sander: a first-principles introduction to Gaussian process regression (GPR), the powerful class of machine-learning algorithms that data scientists turn to for a wide variety of projects. A very good place to go from there? Maxim Ziatdinov’s panoramic look at rotationally invariant variational autoencoders (rVAE) and their ascendancy among those who analyze imaging data for a living. It provides history and context – and several practical examples to consider.
We read posts like the two above in order to learn and grow, yet sometimes – more often than many might think – even well-read, seasoned data scientists experience imposter syndrome. Dale Markowitz shares insights from her own personal experience with this cluster of feelings and gives readers several helpful tools to navigate it.
One of the points that Dale makes is that it’s not just fine, but rather inevitable, that we stumble upon our own knowledge gaps. This is especially true in machine learning, where the model-deployment process relies on numerous small steps in which many a thing can go wrong. We have a soft spot for posts that help fill in these types of specific, well-scoped gaps. If you do too, you’ll appreciate Valerie Carey’s foray into feature choice and its effects on model fairness, and Jacopo Tagliabue’s walkthrough of a tool that automatically generates documentation for machine-learning pipelines. You also won’t want to miss Dr. Sohini Roychowdhury’s U-Net-based tutorial on transfer learning for multi-class image segmentation.
Rounding out this week’s lineup of Editors’ Picks? A London Special: Philip Wilkinson’s post on spatial clustering uses crime data as the basis for several neat visualizations of the city. (If you’re inspired to rewatch Sherlock once you’re done reading, we can’t blame you.)
The art and practice of connection
It’s a common trope by now to suggest that data scientists need to be strong storytellers. There’s a bit less talk about the fact that these stories often live as a piece of writing – whether it’s a slide deck, a scholarly article, or – why not? – a TDS post (feel like writing one?). Editor Elliot Gunn has collected some of our best resources for helping technical experts produce solid writing, and while they offer many different perspectives, they share an important truth: "writing can feel harder than coding."
Research data scientist and TDS Editorial Associate Lowri Williams agreed with this sentiment in our recent Q&A: "I wouldn’t say it’s easier to write a blog post either. To write clearly and consistently is sometimes challenging!" But the benefits are well worth it; read the rest of our conversation to hear more from Lowri about her career path and current work, where she makes the most of her academic expertise to help small and mid-sized business owners.
The crucial role of communication and the need to both build and channel a sense of safety around AI are also top of mind for our recent podcast guest Ethan Perez. Listen to his conversation with host Jeremie Harris to learn more about his work on debate-based strategies for AI (and a lot more!).
Time is never not precious; it’s even more so in a week when many of us gave up an hour to the clocks changing. So we feel particularly grateful for the chunks of it that you choose to spend with us and our writers on TDS, and for the other forms of support you practice – from becoming a Medium member to sharing our work with your networks to giving us your feedback.
Until the next Variable, TDS Editors
Recent additions to our curated topics:
Getting Started
- The Ultimate Guide to Cracking Business Case Interviews for Data Scientists: Part 1 by Emma Ding
- Why Decorators in Python Are Pure Genius by Rhea Moutafis
- Getting Started with GitLab: The Absolute Beginner’s Guide by Marie Lefevre
Hands-On Tutorials
- Custom Audio Classification with TensorFlow by Pascal Janetzky
- Ten Advanced SQL Concepts You Should Know for Data Science Interviews by Terence Shin
- How to Apply Transformers to Any Length of Text by James Briggs
Deep Dives
- Selecting Hyperparameter Values with Sequential, Human-in-the-Loop, Search Space Modification by Hai Rozencwajg
- (Deep) House: Making AI-Generated House Music by Taggart Bonham
- Classifying Simple Color-Matching Outfits with the Help of Fuzzy Logic by Fernando Carrillo