Welcome to today’s vlog featuring Catapult’s Director for Data and AI, Sid Atkinson, and Principal Data Scientist Lee Harper. Today Lee and Sid will be discussing the Pirates vs. Navies analogy and how it applies to data science teams.

 

S: Hello everyone, my name is Sid Atkinson. I am Catapult’s Director for Data and AI, and with me today is Lee Harper. Today we’re going to talk about the first topic in a series on ML engineering: why we use the analogy of pirates and navies, and how it applies to data science teams today. This conversation and the subsequent conversations on MLOps are part of a larger conversation about Data and AI culture.

L: My name is Lee Harper, Principal Data Scientist at Catapult Systems. I lead the data section of the Data and AI practice.

S: So Lee, I gave a small hint of the whole pirates vs. navies thing, which is an analogy that I think is well known if you are in the data science space. For those who are coming from the outside or are new, what fun are we having with that term? Why do we say that?

L: Yarr! Data Scientists often come from academia or analyst backgrounds, where very often we are looking to solve problems. That really is the key defining trait of a Data Scientist. Solving those problems can involve many technologies or a range of tools. But often you get to a stage where you have built a prototype, something that works. You’ve solved the problem. We tend to get in our zone, either alone or as part of a small team, and this has the feel of being pirates. We do a lot with a limited set of tools, and where we can struggle is in going from being the sole operator to the larger scale of the organization. That’s where we start to think: OK, so we are these great pirates, these lone wolves, but how do we go about integrating what we are doing with the organization? How do we bring that knowledge and that tooling to people who aren’t Data Scientists, and do it in a robust way?

S: So when we look at this, there can be a love affair with being the pirate. It can be a romanticized notion.

L: Oh, we love it.

S: How do we keep the pirate spirit but operate in a way where Data Scientists are functioning as good citizens?

L: A big part of this is collaboration with other skill sets. You know, most data scientists are not hardcore engineers, either by temperament or by training. And yet all those things are required to bring the data science work to the organization. You have people doing infrastructure, DevOps, software engineering, integration, security. I don’t know much about those topics beyond the basics. So that’s the idea of being part of this bigger navy. Navies contain many roles and many kinds of ships, each of which is useful independently but not necessarily very effective, yet together as a single unit they can do incredible things.

S: Excellent. So what are some of the things that we will explore? Today’s conversation is just the first of several we are focusing on in MLOps and engineering. What are the other aspects of building this navy that we will discuss in our future sessions?

L: Sure. First we will be digging into why this is a hard problem to solve, because very often it sounds like it should be easy: just scale. Why is scaling machine learning really hard? Understanding that is really useful. And once we have that covered, we can start to look at a few of the components that go into actually building a solution. How do you integrate machine learning models with DevOps? How do you deal with version control for not only code but also data? How do you go about showing that your models are explainable, and that you can reproduce results again and again? All of those are interesting and challenging topics that we will be talking about, along with how we think about them and how we go about solving them for our customers and organizations.

S: Well, thank you, Lee.

L: Absolutely.