Given the following tables how would you know who has the most friends REQUESTS date | sender_id | accepter_id ACCEPTED accepted_at | accepter_id | sender_id
Data Scientist Senior Interview Questions
40,232 data scientist senior interview questions shared by candidates
1) Provided a table with user_id and dates they visited platform, find the top 100 users with the longest continuous streak of visiting the platform as of yesterday. 2) Provided a table with page_id, event timestamp and a flag for a state (which is on/off), find the number of pages that are currently on.
There is a table that tracks every time a user turns a feature on or off, with columns user_id, action ("on" or "off), date, and time. How many users turned the feature on today? How many users have ever turned the feature on? In a table that tracks the status of every user every day, how would you add today's data to it?
Behavioral questions probing about fit
Write a function that takes in two sorted lists and outputs a sorted list that is their union.
They asked probability question: 1) The probability that item an item at location A is 0.6 , and 0.8 at location B. What is the probability that item would be found on Amazon website. 2). I have table 1, with 1million records, with ID, AGE (column names) , Table 2 with 100 records with ID and Salary then the interviewer gave me the following SQL script SELECT A.ID,A.AGE,B.SALARY FROM TABLE 1 A LEFT JOIN TABLE 2 B ON A.ID = B.ID + WHERE B.SALARY > 50000 ( HE ASKED TO MODIFY THIS LINE OF QUERY) How many records would be returned? 3. Give a csv file with ID, and Quantity columns, 50million records and size of data is 2gig, write a program in any language of your choice to aggregate the QUANTITY column.
If 70% of Facebook users on iOS use Instagram, but only 35% of Facebook users on Android use Instagram, how would you investigate the discrepancy?
Write a sorting algorithm for a numerical dataset in Python.
Tell me about yourself
Given the following data: Table: searches Columns: date STRING date of the search, search_id INT the unique identifier of each search, user_id INT the unique identifier of the searcher, age_group STRING ('<30', '30-50', '50+'), search_query STRING the text of the search query Sample Rows: date | search_id | user_id | age_group | search_query -------------------------------------------------------------------- '2020-01-01' | 101 | 9991 | '<30' | 'justin bieber' '2020-01-01' | 102 | 9991 | '<30' | 'menlo park' '2020-01-01' | 103 | 5555 | '30-50' | 'john' '2020-01-01' | 104 | 1234 | '50+' | 'funny cats' Table: search_results Columns: date STRING date of the search action, search_id INT the unique identifier of each search, result_id INT the unique identifier of the result, result_type STRING (page, event, group, person, post, etc.), clicked BOOLEAN did the user click on the result? Sample Rows: date | search_id | result_id | result_type | clicked -------------------------------------------------------------------- '2020-01-01' | 101 | 1001 | 'page' | TRUE '2020-01-01' | 101 | 1002 | 'event' | FALSE '2020-01-01' | 101 | 1003 | 'event' | FALSE '2020-01-01' | 101 | 1004 | 'group' | FALSE Over the last 7 days, how many users made more than 10 searches? You notice that the number of users that clicked on a search result about a Facebook Event increased 10% week-over-week. How would you investigate? How do you decide if this is a good thing or a bad thing? The Events team wants to up-rank Events such that they show up higher in Search. How would you determine if this is a good idea or not?
Viewing 11 - 20 interview questions