Author Spotlight
In the Author Spotlight series, TDS Editors chat with members of our community about their career path in Data Science, their writing, and their sources of inspiration. Today, we’re thrilled to share our conversation with Ani Madurkar.
Ani is a Senior Data Scientist who is passionate about the art and science of machine learning & AI. He’s especially interested in applications for health, environmental, and sports science. Building data science products to empower communities and strengthen networks is where his mind wanders for fun.
When not working on projects, Ani enjoys biking around, doing landscape and street photography, reading philosophy, and writing about his thoughts on Medium.
How did you decide to become a data scientist?
I found my way into data science through a series of crushing failures. I originally wanted to become a physician, specializing in neuroscience, but I failed the MCAT twice, which left me quite lost at the time. Graduating with a Philosophy degree, I felt I had no valuable skills to solve problems for others.
My research at the time exposed me to data analysis, and so I figured I’d give that career a shot—mainly because it sparked my curiosity. Starting from scratch, I taught myself SQL and was able to land a Data Analyst role. This initial step broadened my sights into the data world and my curiosity grew to an insatiable level.
I was able to quickly master data analysis techniques and wanted to become a data scientist as I felt it more naturally fit who I was. This required a myriad of skill sets and experience that I did not have at the time. So I taught myself Python, mastered foundations in statistics and linear algebra, and enrolled in a Master of Applied Data Science program at the University of Michigan while working and leading communities of practice.
This work largely felt like my calling, as opposed to medicine. The concepts, from programming to math, came to me quite naturally and I was able to learn a large volume of information at a rapid pace. After graduating, I quickly found myself building enterprise machine learning systems to bring value to high-profile business projects.
What challenges did you face once you entered the field—and how did you approach them?
There are more details to my journey, but the fundamental concept is that it’s rife with failure and uncertainty. For most of it, I’ve felt completely out of my depth and fighting an aggressive uphill battle.
I approached these with three mental frameworks: insatiable curiosity, relentless drive, and collaborative growth. I was lucky to find a craft that fully piqued my fascination and was filled with theory and applications that I was deeply interested in.
My drive comes primarily from knowing how much my parents sacrificed to give me a healthy and happy life and how close I was to throwing it away because I took too many things for granted. I never wanted to go back to that version of myself, so to do them and myself proud, I keep persevering.
Finally, I believe that if you work as hard as you can but do it alone, you’ll only achieve linear growth at best. To achieve exponential growth for yourself and for those around you, you should be doing it collaboratively. I wanted to be someone who used my skills and knowledge to help those around me, even if I wasn’t the best at the current moment, and that’s a principle I carry with me everywhere. Even with all that I’m currently doing, I always have time to help.
How do you choose the data-focused projects you work on?
I mainly find myself drawn to Bayesian methods and Graph ML projects these days. I think those two methods are not only interesting ways to solve problems, but incredibly fascinating ways to look at the world. I think there is some cutting-edge work being done in Multi-Task Learning and Reinforcement Learning as well, but I haven’t found many industry-ready applications for those just yet.
In the next few months, I want to specialize further in building machine learning systems and not just machine learning models. TensorFlow has a great infrastructure to work on building systems that live far beyond a Jupyter Notebook. I think most organizations have enough people who can read sklearn docs and a handful of articles to do a basic machine learning setup in Jupyter, but there is still a large need to have specialists who know how to create value at scale from those models. MLOps feels a lot like the wild west right now, but I’ve come to love that uncertainty.
What prompted you to start writing publicly about your career and other data-related topics?
I have known for a while that I wanted to create my own brand, but I wanted to get a better footing in this craft before starting. I have been doing photography for almost seven years now, so I’ve been really enticed by creating an online presence through a new art form. I started during my Capstone project for my Master’s program, and initially established what kind of voice I wanted to have online before writing.
I noticed there was a large amount of content for beginners and a large amount of content for researchers, but it felt scattered and haphazard for the intermediate specialists looking to grow effectively and efficiently. I wanted to fit this niche, so I chose to do it by talking about methods and techniques in an applied way that discussed just enough theory to work on your own projects.
The intermediate-to-advanced specialists don’t need to be inundated with theory or to only have code snippets – they need just enough of both. So in a number of my pieces that walk through technical concepts, I offer just enough theory and pair it with code being applied to a dataset to guide a well-rounded understanding.
Also, I’ve always had a passion for teaching as I come from a family who used to be teachers in their past lives. I figured that writing about my thoughts and learnings was the fastest and most effective way to share what I have and learn from others as well.
In your writing on TDS and elsewhere, you blend together several different interests. How do you go about choosing topics for your posts?
I start with the assumption that if I’m struggling with or trying to understand something, it’s likely my peers are, too. I spend a lot of time reading about the problem, concept, or technique and then attempting to put my thoughts on it on paper. I am generally a person whose mind is always racing, so there’s a never-ending stream of things I’m curious about. This creates a flowing content engine that encompasses everything in the art, science, and philosophy of machine learning.
I try to balance my writing by having a piece for others and having a piece for myself as much as I can. I’ll read a lot of blogs or articles, which gives me a good sense of what people are trying to learn or are struggling with, and then I’ll also have my own interest in things I want to learn and am curious about. I’ll try to do one for the readers and one for myself, and sometimes balance both needs in one story. Ultimately, though, I keep a sharp focus on what will provide someone with value.
Do you have any advice for people who might want to write about their work, but aren’t sure where to start or how to find the time?
I think it’s imperative to first start with knowing who you are. What things interest you? What kind of work are you most proud of? Identifying your values and identity gives your content a personality; it gives people something to connect to.
When I think about creating content, I think about it in three stages:
- Creating content for entertainment. This is the base form, as it may just inform readers of a new technique, summarize a book or course, or clarify some concepts. The best of this stage can be genuinely informative. This content generates views.
- Creating content with authenticity. This is when you add a flair of your own thoughts, ideas, and effort into the story. It’s great if you can talk about a new modeling technique in a book, but even better if you can add an anecdote of you trying it out and how it went, or contribute to the source material with your own voice. Adding your personal touch to content creates stories that people truly engage with. This content generates followers.
- Creating content based in your values. The final form of content creation is when you identify what your values are, what you believe in, what you stand for, and then publicly reinforce them in various ways. Do you want to be someone who champions the ethical practice of AI? Write about that in your stories. As you talk about a new ML model, do research and talk about what kinds of new concerns it may bring to society. This is hard mainly because it’s difficult to sit down and know what your values are. It’s okay to get things wrong and edit yourself along the way; start somewhere, fail, learn, improve, and iterate endlessly. This content generates partners.
How do these stages translate to your writing workflow?
There’s nothing wrong with just creating content for entertainment or informational purposes, but know that creating your own brand is far more than regurgitating examples in package docs.
As for finding time to write, I think it can be tough and varies depending on what you want to write and how easily writing comes to you. If you only want to write about project-focused stories that have code in them, you may need a lot of upfront time to get things set up.
Typically, I try to balance my writing with an adequate amount of reading. I always try to have a new book, article, or blog post with me (on my phone mainly) so I can be constantly reading, even if it’s a little bit at a time. The reason for this is it keeps my brain thinking about things in ways I wouldn’t have otherwise. I’m not arguing to have zero idle time, which is incredibly useful to have for creativity. I’m advocating for creating your own content engine. Create ways for inspiration to come to you instead of you going to it.
The more I read, the more surface area I give myself to create new stories. Most of my time is not spent feeling super motivated and inspired; most of my time is spent feeling the opposite, in fact. And in those moments, it’s best to read. So then, when inspiration and energy strike, I’m ready to put my thoughts down and write.
Looking towards the next year or two, what kinds of changes do you hope to see in data science as a field?
I’m a hyper-techno optimist so there’s a lot I hope to see in the future, but I’ll start with maybe the most controversial. I hope to see better-scoped machine learning projects. There are a lot of reasons ML projects don’t go according to plan today, but I feel too many of them stem from projects being led or proposed by people who may not fully understand machine learning and how to actively make data useful. Having people who have not done the work lead large-scale projects like these can be quite dangerous, as it leads to inaccurate timelines, burnt-out scientists and engineers, subpar systems, and more.
There may be a lower barrier to entry for learning these concepts, but that should not be confused with a low barrier to excellence. It’s a difficult craft to do right and, as a consequence, a difficult craft to lead. The best projects I have ever worked on were led by scientists/engineers who have spent most of their career doing statistics or machine learning and evolved to learn the business. Obviously, that is a rarity, so I hope the future brings a more sensible and attainable approach to constructing projects with a high probability of success.
I’m also terribly excited about data-centric ML. A laser focus on quality data is going to be an imperative step for the future of data science and tech. Whatever challenges we’re facing today with data (and there are a lot of them) are going to get exponentially harder as we start to become a technocentric society.
Technology has been our chosen weapon to combat society’s toughest problems such as climate change, public health, etc. Whether we like it or not, we are heading toward a future where we have sensors, cameras, virtual assistants, and more integrated seamlessly into society. This brings about a plethora of ethical issues and data issues. This will require us to think far past Jupyter notebooks, and that is not a one-person job. We’ll need to become better about handling and deploying data at scale to deal with these new problems, and our best defense is our collaborative and collective nature. Open source, virtual communities, and broad sharing of knowledge are crucial parts of our ability to successfully build machine learning and AI systems to solve our most challenging issues.
The field is evolving at a rapid pace with new models, tools, and papers released each day. Our ability to connect with each other and diversify the table where decisions are made is going to be a necessary step to creating systems that help more than hurt. I’m deeply looking forward to working on solutions for complex problems with more people who inspire me.
To explore more of Ani’s work, head over to his Medium profile, or follow him on Twitter. Or check out some of his recent TDS posts:
- AutoML and the Future of Data Science
- Invisible Skills That Distinguish Expert Data Scientists
- 5 Data Science Trends in the Next 5 Years
To support the work of TDS authors and to gain unlimited access to our archives, consider becoming a Medium member today.
Note: this interview was lightly edited for length and clarity.