What is a p-value? What is a confidence interval (CI)?
Data Scientist Interview Questions
40,238 data scientist interview questions shared by candidates
How do you deal with an unbalanced classification problem?
I had a sequence of logs from a driver: timestamps that the driver's app keeps sending every 15 seconds, e.g. [10, 25, 40, 100, 115, 130, ...]. If the gap between consecutive timestamps is more than 15 seconds, the driver is considered offline. Find the number of hours the driver was online.
Which SQL syntax is used to slice a dataset into several chunks?
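One possible reading of the driver-log question, sketched in Python. It assumes the timestamps are in seconds and already sorted, and it counts a gap as online time only when it is at most 15 seconds. The function name `online_hours` and the `heartbeat` parameter are hypothetical, not part of the original question.

```python
def online_hours(timestamps, heartbeat=15):
    """Return hours online, assuming sorted timestamps in seconds."""
    online_seconds = 0
    # Compare each timestamp with the one before it.
    for prev, curr in zip(timestamps, timestamps[1:]):
        gap = curr - prev
        # A gap longer than the heartbeat means the driver went offline,
        # so only short gaps contribute to online time.
        if gap <= heartbeat:
            online_seconds += gap
    return online_seconds / 3600


logs = [10, 25, 40, 100, 115, 130]
hours = online_hours(logs)  # four 15-second gaps count; the 60-second gap does not
```

With the sample log, the online time is 60 seconds, i.e. 60 / 3600 hours. Whether the final heartbeat interval should also count is a detail worth clarifying with the interviewer.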
Given a list, create a new list that does not include the duplicates of the original list.
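A minimal sketch of one way to answer the duplicate-removal question in Python. It assumes the items are hashable and that the first occurrence of each item should keep its position; the function name `dedupe` is hypothetical.

```python
def dedupe(items):
    """Return a new list without duplicates, keeping first occurrences in order."""
    # dict keys are unique and preserve insertion order (Python 3.7+).
    return list(dict.fromkeys(items))


original = [3, 1, 3, 2, 1]
unique = dedupe(original)
```

If order does not matter, `list(set(items))` is a shorter alternative, but it does not guarantee the original ordering.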
Into how many regions can 4 lines divide the 2D plane? Into how many regions can 4 hyperplanes (planes) divide 3D space?
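Assuming the question asks for the maximum count, with the lines or planes in general position, the number of regions cut out by n hyperplanes in d dimensions is the sum of C(n, k) for k from 0 to d. The short Python sketch below evaluates that sum; the function name `max_regions` is hypothetical.

```python
from math import comb


def max_regions(n, d):
    """Maximum number of regions that n hyperplanes in general position cut d-dimensional space into."""
    return sum(comb(n, k) for k in range(d + 1))


lines_in_plane = max_regions(4, 2)   # 1 + 4 + 6
planes_in_space = max_regions(4, 3)  # 1 + 4 + 6 + 4
```

Under that assumption, 4 lines give at most 11 regions in 2D and 4 planes give at most 15 regions in 3D. Parallel or concurrent arrangements yield fewer.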
Sample k items uniformly at random in a streaming fashion from a list of unknown length.
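This is commonly answered with reservoir sampling. The sketch below follows the classic single-pass version (often called Algorithm R); the function name `reservoir_sample` and the optional `rng` parameter are hypothetical choices made for illustration.

```python
import random


def reservoir_sample(stream, k, rng=random):
    """Return k items chosen uniformly at random from an iterable of unknown length."""
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item i (0-based) replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


sample = reservoir_sample(iter(range(100)), 5, random.Random(0))
```

The stream is read once and only k items are held in memory. If the stream has fewer than k items, all of them are returned.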
Case study
10 questions to do in 2 hours.
3 questions: two SQL + one Python (though the Python one can also be easily solved with SQL). The SQL question is about finding the most popular track and, for each track, the day on which it was most popular. The Python question is about adding certain advertising seconds on every 3rd, 5th, and 15th stream a user had. The interviewers appeared to have limited experience with SQL, as both had moments of first doubting my code and then admitting they had made a mistake. As one example, when doing an inner join of two tables I wrote 'FROM tableA a, tableB b WHERE a.metric = b.metric', and the interviewer first asked me where my inner join of the two tables was, as in her mind it should be written as 'FROM tableA a INNER JOIN tableB b ON a.metric = b.metric'. After I explained, she realized that both ways of writing it are equivalent. Though I got all three questions right and explained them well, I did not pass the tech interview and there was no feedback.
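The equivalence of the two join spellings can be checked with a small experiment. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the table and column names `tableA`, `tableB`, and `metric` come from the account above, while the `a_value` and `b_value` columns and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (metric INTEGER, a_value TEXT);
    CREATE TABLE tableB (metric INTEGER, b_value TEXT);
    INSERT INTO tableA VALUES (1, 'a1'), (2, 'a2'), (3, 'a3');
    INSERT INTO tableB VALUES (2, 'b2'), (3, 'b3'), (4, 'b4');
""")

# Implicit join: comma-separated tables with the join condition in WHERE.
implicit_join = """
    SELECT a.metric, a.a_value, b.b_value
    FROM tableA a, tableB b
    WHERE a.metric = b.metric
    ORDER BY a.metric
"""

# Explicit join: INNER JOIN with the condition in ON.
explicit_join = """
    SELECT a.metric, a.a_value, b.b_value
    FROM tableA a INNER JOIN tableB b ON a.metric = b.metric
    ORDER BY a.metric
"""

implicit_rows = conn.execute(implicit_join).fetchall()
explicit_rows = conn.execute(explicit_join).fetchall()
conn.close()
```

Both queries return only the rows whose `metric` appears in both tables. The explicit form is often preferred for readability because it keeps join conditions separate from filters.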