What is a Data Scientist?

Data Science is “the sexiest job of the 21st century” according to Harvard Business Review… That sounds great, and for those of us floating in clouds of data gathering beautiful insights, indeed it can be a great job. It’s a blend of statistical and coding skills with business insight, problem solving, database skills, and the ability to understand and communicate very well with non-technical business people and formulate their important (but fuzzy) problems in good, clear technical ways. However, it also isn’t a job for everyone, and we sometimes find there is confusion amongst job applicants and new starters at King about what the job really involves.

Part of this confusion undoubtedly stems from King’s particular needs, and these needs will be different in other companies or industries. Some of the confusion stems from the fact that sites such as Kaggle call themselves ‘the home of data science’. While Kaggle is great, it only really does predictive analytics, which doesn’t apply to 90% of what we do at King.

This is what we say in the current King job description for Data Scientists:

If you want to improve the experience of King’s 300 million+ players across our network of games and help us to further understand, model, predict, segment, monetise and retain those customers, then this may be the right opportunity for you.
You will:

  • Identify potential business opportunities within your area and scope/design approaches to capture those opportunities.
  • Translate business needs to technical requirements and AB-tests, and work with development teams to ensure correct implementation, specifically including correctly tracking the right information to allow easy, precise analysis of the impact.
  • Develop an analysis strategy and perform analysis of complex scenarios and AB-tests, both systematically and on a one-off basis.
  • Carefully check, debug, and problem solve issues to ensure you deliver accurate and clear analysis and reports, quickly, even when confronted by subtle data complications.
  • Provide an analytics perspective to discussions and prioritisation within your team, so that the overall team selects the right ideas to work on.
  • Be the pro-active owner of the entire data chain for your game(s).

So, where does this go wrong? And how can someone know if this is the right job for them?

Well, sometimes we find people who are technically fantastic and if given a super-clear technical problem to solve will do great. But data science doesn’t involve super-clear technical problems! Most problems start out as fuzzy, non-technical business problems and the first step is to understand enough to translate that into a suitable technical problem. That’s hard, since ‘suitable’ will mean making approximations and simplifications that require good intuitions about what is and isn’t important to the players, games and to the business. And then that technical problem will almost certainly need to be further adapted as you learn more from the technical work. So the technical wizard who doesn’t genuinely understand the players, games and business ends up solving the wrong problem, and even if given the right problem to start with, soon ends up going down an unhelpful track as the problem evolves.

Some problems genuinely require sophisticated statistical techniques or the latest machine learning approach (some years ago this was Random Forests, now it’s Deep Learning), but many problems don’t. And almost all problems we face at King benefit from starting simple, gaining both technical and business understanding, and only adding complexity where it is really needed. Complex models often obscure the important questions about our players – questions like ‘why?’ or ‘in what context?’ So, while Kaggle may make data science look like it is all about trying to predict some outcome 0.1% better than the next person, at King we are only rarely concerned about prediction. We wouldn’t care at all about a 0.1% better prediction, or even a 5% better prediction, if it meant we lost clarity about why something is relevant to our players. So while we do have a handful of specific data science roles at King where machine learning and predictive analytics skills are essential (we really care about forecasting the lifetime value of our players, for example), in 90% of our roles these skills are rarely if ever used.

At King we also make a fairly clear distinction between ‘Data Engineers’ and ‘Data Scientists.’ Data Engineers care about storing and organising the data, transforming, and processing it – making cleaned data available in high-performance, reliable systems for everyone else in the business to use, and indeed choosing what technologies are best suited to managing that whole data chain. While it is sometimes valuable for some data scientists to be closely involved in that work (and so we like to hire a few data scientists who have some of those skills on the side), most data science work at King starts with the assumption that the data is already available in pretty good shape. Of course, with more than 300 million active players in a confusing, changing world of multiple devices, platforms and games, even ‘pretty good shape’ is never quite as perfect as we’d like, so, like all scientists, Data Scientists do need to carefully validate and check their work and assumptions and therefore be good data detectives as well.

In summary, for us at King, data science is really just about being a scientist with data. So being a great Data Scientist is about being a great scientist. What skills does a great scientist have? Well, a great scientist develops a deep understanding of the fundamental properties of nature, and our ‘nature’ is our players, games, and our business’s interaction with them. So our Data Scientists need to have the ability to develop a deep understanding of the nature of players, games, and King’s business. Also, a great scientist can design careful, validated experiments that provide new, clear insights into nature, so our Data Scientists need those skills too. Of course a great scientist needs to have the appropriate technical skills too – for us that’s mostly Statistics, SQL, R and Python. Finally, while some scientists are renowned for being bad communicators, if you’ve read this far I think you’ll understand that doesn’t work for us – great communication skills are essential (a) for Data Scientists to genuinely understand what problem they should be solving and (b) for their work to have the biggest impact on their colleagues, game teams, producers, etc. So the job really requires a mixture of hard and soft skills – and while we don’t necessarily expect all our Data Scientists to have these skills on their first day in the office, we do expect them to be able to learn and develop expertise in every one of these areas.
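To make the ‘start simple’ point about experiments a little more concrete, here is a minimal sketch of the sort of first-pass AB-test check a Data Scientist might run before reaching for anything more complex. It is not King’s actual tooling: the function name, the choice of a pooled two-proportion z-test, and the player counts are all assumptions made purely for illustration.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for the difference between two conversion rates.

    conv_a / n_a: converting players and total players in group A (control).
    conv_b / n_b: converting players and total players in group B (variant).
    Returns (difference in rates, z statistic, two-sided p-value).
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("both groups need at least one player")

    rate_a = conv_a / n_a
    rate_b = conv_b / n_b

    # Pooled rate under the null hypothesis that both groups behave the same.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("no variation in outcomes, the test is undefined")

    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_b - rate_a, z, p_value


# Hypothetical numbers, invented for this example only.
diff, z, p_value = two_proportion_z_test(conv_a=1000, n_a=10000,
                                         conv_b=1100, n_b=10000)
print(f"difference={diff:.4f}, z={z:.2f}, p={p_value:.4f}")
```

A check like this only says whether a difference is likely to be real. The harder questions, ‘why?’ and ‘in what context?’, still need the understanding of players and games described above.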

In closing, since what we really care about is making the best possible player experience in our games, and we’re talking about Data Scientists actually doing science on games and players, perhaps a better name for the job that most Data Scientists do at King would be ‘Game Scientist’ or ‘Player Scientist’… And with that player-centric focus, this is perhaps not that different from what some firms are calling a ‘quantitative user experience researcher,’ and almost certainly means our official Data Scientist job description needs a bit of an update…

So, back to where we started – is it the sexiest job of the 21st century? Well, more often than we’d like we’re not really floating in data clouds gathering beautiful insights, but rather digging in the data ditches trying to make sense of a messy world, with a shovel that isn’t strong enough and bug-ridden soil that is crumbling around us, while the game needs answers ‘right now’. In that world those beautiful insights can be quite elusive, and to help we certainly need better tools, better systems, better environments. But that’s a story for another day…

Vince Darley

About Vince Darley

As Chief Scientist, Vince helps King build a world-leading Data Science team: awesome people, tools, systems, methodologies, and the best analytics platforms for experimentation and segmentation. Vince also leads a research team digging deeper into what our data tells us about our players and games, gathering insights to benefit the business. Vince has been working on analytics in its various forms for close to 20 years in both the US and the UK, where he currently lives. Besides big data he likes big runs, having completed several 100-mile races and other ultramarathons over recent years.
