Steph Locke on building a solid foundation in data science
We spoke to Steph Locke about how much experience is needed to build a solid foundation in data science and how to future-proof your tech skills.
Steph Locke is something of a game-changer in the data science industry. She is one of just three individuals in the world to be recognised with Microsoft’s Most Valuable Professional award for AI and data platforms. This year, she was also dubbed the Most Innovative Woman in Artificial Intelligence at the Influential Businesswoman Awards.
Locke spoke to Siliconrepublic.com about her data science consultancy business Locke Data, and the annual Data Science Bootcamp she’s currently leading in Dublin’s Talent Garden.
‘English is still the primary language of technology and that means that there’s going to be huge demand in Dublin for data science over the next few years’
Tell us a bit about your background.
I am a data scientist and I currently run two businesses, Locke Data and Nightingale HQ. Locke Data is my consultancy and training capability, and at Nightingale we’re looking to help the whole world of business adopt artificial intelligence.
Data science and AI are a big passion of mine. I started out helping someone move from tallying sales on a Post-it to looking at a screen, to automating that and building forecast models for how we can do stuff more effectively in marketing. I’ve gone from there, always looking at how data and predictive capabilities can help businesses. It’s always moving more to the technical side for how I deliver things, but always very much about how I can change businesses.
I spend a vast amount of my free time presenting at conferences, and I do a lot of technical community work. I organise conferences, I do mentoring, I do lots of presenting, I write books when I can, that kind of thing. I’m also working on how not to overwork quite so much at the moment!
Can you tell us about Locke Data’s role in Ireland?
As a consultancy, our focus is on helping businesses start doing data science. It’s a big cultural and technical shift as you start thinking about questions like, “How can I use the information we already have to better improve this process?”
I help businesses start addressing those kinds of questions. In Ireland, there are a couple of businesses based in the country that are very, very good at data science. You’ve got Salesforce based in Dublin, but most businesses are still going through the first question about how to use the abundance of data that they have to help their business grow.
As one of the previous hosts of the Talent Garden Data Science Bootcamp once pointed out, “Pretty soon, Ireland’s going to be the only English-speaking country in the EU”. English is still the primary language of technology and that means that there’s going to be huge internal demand in Dublin for data science over the next few years.
Then there’s going to be that external demand for qualified Irish data scientists who can go into any country and engage with this.
What’s the main aim of the Talent Garden Data Science Bootcamp?
A solid, practical foundation in data science.
There are different courses in the market and there are different backgrounds for data scientists. You’ve got people coming from academia who are very good at the statistics and building something that focuses really well on what’s happening. Their goal, as an academic data scientist, is to find out why something happened. Not necessarily to change how it happens, but to expose what happened.
Then we’ve got programmer data scientists and developers who are used to building software. They start trying to improve their processes by adding machine learning. They know how to do it from a code-heavy perspective, but they don’t necessarily end up with the best method, or being able to say that something actually represents reality. It just theoretically looks as though it’s improving things.
The aim of the Bootcamp is to give somebody the statistical underpinning. Obviously we can’t do a PhD in 12 weeks of lectures, but we can give somebody an understanding of how to draw conclusions, basically, and of what the different types of algorithms you can use actually do and what their pros and cons are.
We’re giving them that solid data science foundation because they can build a lot of learning on that. We’re also making it practical, so they learn how to code and how to get these things into production, into processes, so that they’re actually making a difference.
Who can benefit most from the Bootcamp?
We’ve got three key audiences. One is that academic data scientist, who needs to understand a bit more of the practical. Then we’ve got the developer-first data scientist, who may be a little bit weak on the statistical foundation. And then we have people from data-heavy business functions such as finance and business intelligence, people who have a strong understanding of the business processes and are looking to help improve them using machine learning.
What’s the minimum foundation in data science that someone would need to benefit from the programme?
Some Excel knowledge. We’ve given people pre-reading of two books that I have written. They are kind of teach-yourself-at-home books. The first one teaches people with Excel experience how to understand data types, and we take people through the kind of code that they might need. As they go through the course, they’ll have support all along the way through the Bootcamp Slack group, where they can communicate with others who have complementary experience as developers or academics. They’ll be able to grow their skills and fix any weaknesses that they start out with.
Where should a data scientist start when it comes to future-proofing their skills?
A data scientist should definitely be able to code. The first language you learn is always the hardest. At this point it doesn’t matter whether you learn Python, Julia or whatever. There’s a bunch of different languages that can work. Be aware of what each language can potentially bring, and its capabilities.
Then from the machine learning side, have your business data – like the data everyone has about how tweets are performing, who your customers are and what your processes are doing. That data is here to stay. That data is never going to go away. It doesn’t matter what industry you work in.
To future-proof their skills and make them increasingly relevant on the maths side, people should be looking at the newer techniques around deep learning and reinforcement learning, which are helping us work on unstructured data like video and images, that sort of thing.
Anything else to add in relation to the Bootcamp?
I think the Bootcamp is an incredibly cost-effective way for a business to trial whether data science is a good fit for it and its staff. Particularly for this first iteration, it’s the cost of two or three days of on-site training inside a business. For this price, you get 18 weeks of training for one member of staff from Talent Garden. It’s going to have a tremendous return on investment for any business that sponsors a team member.