A quick handbook to the Data Scientist interview at Twitter

Do you know why Data Scientists are among the most sought-after people at Twitter? It is because this tech giant has around 321 million active users each month, and making sense of a data set that large takes dedicated expertise.

Twitter has become one of the quickest ways to stay up to date with the world. With so many users contributing to the platform, Twitter owns one of the world’s largest real-time datasets. This is where the need for Data Scientists comes in.

Twitter's Data Science and Analytics teams deploy advanced analytics and machine learning tools to improve products and features and to deliver more relevant content in users' feeds.

What does a Data Scientist at Twitter do?

The Data Scientist role is divided into two tracks: data roles and research roles. Each track is further narrowed down to a specific function within a specific team.

What the role involves depends heavily on the team and on the feature or service that team supports. The work may range from analytics to model design to building heavy machine learning systems.

Twitter screens candidates carefully and prefers skilled professionals with at least 2 years of experience and a hands-on background in data infrastructure or backend systems.

An engineering background or a solid understanding of data systems is therefore handy, unless the role is purely analytics-focused.

What are the basic qualifications required for a Data Scientist position at Twitter?

Bachelor’s or Master’s in Computer Science, Statistics, Math, Engineering, or other quantitative disciplines.
Experience with analysing large data sets and with MapReduce architectures like Hadoop, as well as other open-source data mining and machine learning projects.
Substantial experience using languages and frameworks like Python, SQL, R, Spark, or Scalding for writing complex data flows.
Proficient in the use of Tableau or Zeppelin for analysis, modelling, and data visualization.
Experience in applying advanced statistical techniques to model user behavior, identify causal impact and attribution, build and benchmark metrics.

What is the Data Scientist interview process at Twitter?

The interview follows a very standardised process that includes 3 rounds. The 1st round is a phone screening where a recruiter gets on a call with the candidate. The 2nd round focuses on a technical screening and the 3rd round is an on-site interview.

Data Scientist Interview rounds at Twitter

Round 1: Phone Screening

This round is focused on getting to know the candidate and discussing the job role applied for. The recruiter will ask you questions about your resume, experience and your goals.

The recruiter will walk you through the whole hiring process and give further explanation of the job position. You might even be asked questions related to Twitter.

Pro Tip: Make sure you know your resume inside out, as this will make answering the questions much easier. Do ask the recruiter about any doubts you may have.

Round 2: Technical Screening

The second round is designed to test your technical skills and knowledge. It will be led by an in-house Data Scientist. The questions will cover topics such as SQL/Python-based coding, machine learning, data sets, and product intuition, with a focus on experimentation.

Pro Tip: Make sure to research and study how Twitter as a product works and think about questions related to ‘driving results out of experiment-based testing’.

Round 3: On-site Interview

The last round consists of one-on-one interviews with 5-6 people, who are usually Data Scientists and Data Engineers at Twitter. Each interview will last approximately 45 minutes. You will be given a few case studies in this round, and a Q&A will follow.

Pro Tip: Practice algorithm questions as well as whiteboard coding, on topics ranging from machine learning to statistics/probability and product-based questions.

The on-site interview covers a wide range of technical concepts. Do study experiment design and A/B testing questions, SQL, machine learning questions, and product-type questions.
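As an illustration of the A/B testing topic mentioned above, here is a minimal sketch of a two-proportion z-test, one common way to compare conversion rates between a control group and a variant. The function name, the example counts, and the choice of this particular test are assumptions made for illustration; they are not taken from Twitter's interview material.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates.

    conv_a / n_a: conversions and sample size in the control group.
    conv_b / n_b: conversions and sample size in the variant group.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both groups convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical numbers: 100 of 1000 control users and 150 of 1000
# variant users performed the target action.
z_score, p_value = two_proportion_z_test(100, 1000, 150, 1000)
print(f"z = {z_score:.2f}, p = {p_value:.4f}")
```

In an interview, the code matters less than being able to explain the null hypothesis, the choice of metric, and how the sample size was decided before the experiment started.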

You may also be asked behavioural questions that explore your previous experience and your culture fit at Twitter.

A few sample questions from the Twitter database

What would you change in the Twitter app? How would you test if the proposed change is effective or not?
Design a system to find the top ten Twitter hashtags in the most recent 1 min, 10 min, and 1 hr.
How would you measure user engagement given all of Twitter’s analytics and tracking data?
Write a query in SQL to measure the number of ads viewed in moments versus news feed.
Given a two-column file with user codes and counts, retrieve the top-k users based on a score that is a function of the number of times they appear on the file and these counts.
Given a list of all followers in the format 123, 345;234, 678;345, 123;… where the first column contains the ID of the follower and the second one is the ID of who is followed, find all mutual follows (the pair 123, 345 in the example above). Do the same for the case where this list does not fit into memory.
If you got the job at Twitter and had access to all of its data, what kind of data analysis would you like to perform?
How can you illustrate a tree-based system with a SQL query?
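To make the mutual-follows question above more concrete, here is a minimal in-memory sketch in Python. It assumes the input is a single string of "follower, followed" records separated by semicolons, as in the example; the function name mutual_follows is hypothetical, and this is one possible approach rather than an official answer.

```python
def mutual_follows(raw):
    """Return the set of mutual-follow pairs as (smaller_id, larger_id)."""
    seen = set()    # directed edges (follower, followed) read so far
    mutual = set()  # unordered pairs that follow each other
    for record in raw.split(";"):
        record = record.strip()
        if not record:
            continue  # tolerate a trailing semicolon
        follower, followed = (int(part) for part in record.split(","))
        # If the reverse edge was already seen, the pair is mutual.
        if (followed, follower) in seen:
            mutual.add((min(follower, followed), max(follower, followed)))
        seen.add((follower, followed))
    return mutual


print(mutual_follows("123, 345;234, 678;345, 123"))
```

For the case where the list does not fit into memory, one commonly suggested direction is to partition the records on disk by a hash of the unordered pair, so that both directions of a follow land in the same partition, and then apply the same logic to each partition separately.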

20 interview questions to practice for Data Scientist role at Twitter

Here are questions that will help you sail through the interview process:

EASY

What is Regularization and what kind of problems does regularization solve?
Explain the difference between lists and tuples.
Write a program in Python that takes input as the weight of the coins and produces output as the money value of the coins.
Which package is used to do data import in R and Python? How do you do data import in SAS?
Which would you prefer: R or Python?
What packages are used for data mining in Python and R?
What are the alternatives to PyTorch?
What is PyTorch?
What are lambda functions?
What libraries do Data Scientists use to plot data in Python?
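Two of the questions above, the difference between lists and tuples and what lambda functions are, can be illustrated with a short Python sketch. The variable names and values are made up for the example.

```python
# Lists are mutable: items can be added, removed, or replaced in place.
hashtags = ["#data", "#ml"]
hashtags.append("#sql")

# Tuples are immutable: item assignment raises a TypeError.
point = (3, 4)
try:
    point[0] = 10
    tuple_was_modified = True
except TypeError:
    tuple_was_modified = False

# A tuple of hashable items is itself hashable, so it can be a dict key.
# A list cannot be used this way.
visits = {point: 1}

# A lambda is a small anonymous function, often passed as a sort key.
by_length = sorted(hashtags, key=lambda tag: len(tag))

print(hashtags, tuple_was_modified, by_length)
```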

INTERMEDIATE

Why do Data Scientists use combinatorics or discrete probability?
What is Collaborative filtering?
What is an API? What are APIs used for?
What is the difference between squared error and absolute error?
What is association analysis? Where is it used?
Differentiate between univariate, bivariate, and multivariate analysis.
What does a p-value signify about statistical data?
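For the question above on squared error versus absolute error, a small worked example may help. The sketch below uses made-up numbers and two hypothetical helper functions to show that squared error penalises one large miss much more heavily than several small ones, while absolute error treats them the same when the total miss is equal.

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mean_absolute_error(actual, predicted):
    """Average of the absolute differences between actual and predicted."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


actual = [10, 12, 11, 13]
small_misses = [11, 11, 12, 12]  # off by 1 on every point
one_big_miss = [10, 12, 11, 17]  # exact on three points, off by 4 on one

# Both predictions have the same mean absolute error...
print(mean_absolute_error(actual, small_misses),
      mean_absolute_error(actual, one_big_miss))
# ...but the single large miss has a much larger mean squared error.
print(mean_squared_error(actual, small_misses),
      mean_squared_error(actual, one_big_miss))
```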

HARD

List differences between DELETE and TRUNCATE commands.
How will you get the second highest salary of an employee emp from employee_table?
How is SQL different from NoSQL?
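The second-highest-salary question above has several valid answers. Here is a minimal sketch of one of them, run against an in-memory SQLite database from Python. The table name employee_table comes from the question; the column names name and salary and the sample rows are assumptions made for illustration.

```python
import sqlite3

# Build a throwaway in-memory table with hypothetical columns and rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee_table (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee_table (name, salary) VALUES (?, ?)",
    [("Ana", 120), ("Ben", 150), ("Cy", 150), ("Di", 90)],
)

# The highest salary strictly below the overall maximum is the
# second-highest distinct salary, even when the top salary is tied.
query = """
    SELECT MAX(salary)
    FROM employee_table
    WHERE salary < (SELECT MAX(salary) FROM employee_table)
"""
second_highest = conn.execute(query).fetchone()[0]
print(second_highest)
conn.close()
```

Interviewers often follow up by asking how the query behaves with ties or with fewer than two distinct salaries. With this formulation, ties at the top are ignored, and the result is NULL (None in Python) when no second distinct salary exists.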
