Python and SQL in the telephonic round; Hadoop input file formats and when to use which; designing a streaming system; Hive optimization; Spark implementation; Python multi-environment questions; Agile and behavioural rounds.
Senior Data Engineer Interview Questions
2,590 senior data engineer interview questions shared by candidates
Shared in description.

Question 1) Given input.csv, we need to find the output. The file and the desired output are given below.

username, mobile
user1,999999991:888888882
user3,777777771
user2,777777234:823232351
user5,734452343:943433434:834323434
user1,999999991:9994433777

output
user1:3
user2:2
user3:1

Question 2) How can we read a CSV file into a DataFrame?
Question 3) Option to modify the encoding while reading a file in Scala
Question 4) Option to modify the timestamp while reading a file
Question 5) How to introduce separators like "," while reading a file
Question 6) How to infer the schema
===============================
Question 7) We have the 2 tables below. We need to find the users who visited a bank but didn't make any transactions.

-- Visits table:
-- +---------+------------+
-- | user_id | visit_date |
-- +---------+------------+
-- | 1       | 2020-01-01 |
-- | 2       | 2020-01-02 |
-- | 12      | 2020-01-01 |
-- | 19      | 2020-01-03 |
-- | 1       | 2020-01-02 |
-- | 2       | 2020-01-03 |
-- | 1       | 2020-01-04 |
-- | 7       | 2020-01-11 |
-- | 9       | 2020-01-25 |
-- | 8       | 2020-01-28 |
-- +---------+------------+

-- Transactions table:
-- +---------+------------------+--------+
-- | user_id | transaction_date | amount |
-- +---------+------------------+--------+
-- | 1       | 2020-01-02       | 120    |
-- | 2       | 2020-01-03       | 22     |
-- | 7       | 2020-01-11       | 232    |
-- | 1       | 2020-01-04       | 7      |
-- | 9       | 2020-01-25       | 33     |
-- | 9       | 2020-01-25       | 66     |
-- | 8       | 2020-01-28       | 1      |
-- | 9       | 2020-01-25       | 99     |
-- +---------+------------------+--------+
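One possible way to approach Question 1 and Question 7, sketched in plain Python. This is not the answer the interviewer expected, only an illustration under stated assumptions. For Question 1 it assumes the task is to count the distinct mobile numbers per user, with ":" separating numbers; note that the posted output does not list user5, which this sketch would report with a count of 3. For Question 7 it assumes "didn't make any transactions" means the user never appears in the Transactions table, and it runs the SQL through an in-memory SQLite database purely for demonstration. The function names, the lowercase table names `visits` and `transactions`, and the constants are hypothetical.

```python
import csv
import io
import sqlite3
from collections import defaultdict

SAMPLE_CSV = """username, mobile
user1,999999991:888888882
user3,777777771
user2,777777234:823232351
user5,734452343:943433434:834323434
user1,999999991:9994433777
"""


def count_distinct_mobiles(csv_text):
    """Question 1 (assumed meaning): distinct mobile numbers per user."""
    mobiles = defaultdict(set)
    reader = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    next(reader)  # skip the header row
    for row in reader:
        if len(row) < 2:
            continue  # ignore blank or malformed lines
        user, numbers = row[0].strip(), row[1].strip()
        mobiles[user].update(n for n in numbers.split(":") if n)
    return {user: len(nums) for user, nums in mobiles.items()}


VISITS = [
    (1, "2020-01-01"), (2, "2020-01-02"), (12, "2020-01-01"),
    (19, "2020-01-03"), (1, "2020-01-02"), (2, "2020-01-03"),
    (1, "2020-01-04"), (7, "2020-01-11"), (9, "2020-01-25"),
    (8, "2020-01-28"),
]
TRANSACTIONS = [
    (1, "2020-01-02", 120), (2, "2020-01-03", 22), (7, "2020-01-11", 232),
    (1, "2020-01-04", 7), (9, "2020-01-25", 33), (9, "2020-01-25", 66),
    (8, "2020-01-28", 1), (9, "2020-01-25", 99),
]

# Anti-join: keep visitors that have no matching row in transactions.
QUERY = """
SELECT DISTINCT v.user_id
FROM visits AS v
LEFT JOIN transactions AS t ON t.user_id = v.user_id
WHERE t.user_id IS NULL
ORDER BY v.user_id
"""


def visitors_without_transactions(visits, transactions):
    """Question 7 (assumed meaning): users who visited but never transacted."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE visits (user_id INTEGER, visit_date TEXT)")
        conn.execute(
            "CREATE TABLE transactions "
            "(user_id INTEGER, transaction_date TEXT, amount INTEGER)"
        )
        conn.executemany("INSERT INTO visits VALUES (?, ?)", visits)
        conn.executemany("INSERT INTO transactions VALUES (?, ?, ?)", transactions)
        return [row[0] for row in conn.execute(QUERY)]
    finally:
        conn.close()


if __name__ == "__main__":
    print(count_distinct_mobiles(SAMPLE_CSV))
    print(visitors_without_transactions(VISITS, TRANSACTIONS))
```

If the interviewer instead meant visits on dates without a transaction, the join condition would also need to compare `visit_date` with `transaction_date`; the question as posted does not say which reading was intended.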
The main portion of the interview is their HackerRank assignment. I was told I received the assignment for Data Engineers.
What should you do when the interviewer doesn't join the interview meeting?
Prime Factorization... They expected me to remember the Sieve of Eratosthenes off the top of my head... That has absolutely nothing to do with the job role they described and I highly doubt they use it in their code base.
4 rounds of interviews: 1) Technical test hosted on HackerEarth. 2) HR. 3) Hiring Manager. 4) Co-founder assesses "cultural fit" (Are you willing to work like a donkey for the company?)
How did you solve data source cleansing issues? How did you solve gaps in data?
How would you handle a situation where there was conflict between yourself and another coworker?
Given a function rand5() that returns a uniform random integer between 1 and 5, how do you make a function that returns a uniform random integer between 1 and 7?
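A common way to answer the rand5() question is rejection sampling; the sketch below is one such approach, not necessarily what the interviewer had in mind. Two calls to rand5() are combined into a uniform value from 1 to 25, values above 21 are discarded, and the remaining 21 values are folded evenly onto 1 to 7. Here `rand5` is a stand-in built on Python's `random` module, since the question does not provide an implementation.

```python
import random


def rand5():
    """Stand-in for the given function: uniform integer in 1..5."""
    return random.randint(1, 5)


def rand7():
    """Uniform integer in 1..7 built only from rand5(), via rejection sampling."""
    while True:
        # Two independent draws give a uniform value in 1..25.
        value = 5 * (rand5() - 1) + rand5()
        # Keep 1..21 (three full groups of seven); reject 22..25 and retry.
        if value <= 21:
            return (value - 1) % 7 + 1


if __name__ == "__main__":
    print([rand7() for _ in range(10)])
```

Each attempt succeeds with probability 21/25, so the expected number of rand5() calls per result is 2 × 25/21, roughly 2.38.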
Calculate the median value of a given unsorted array. Find the time complexity of the solution. How to improve the solution?
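One way to frame an answer to the median question: sorting the array and taking the middle element costs O(n log n), and a selection algorithm such as quickselect improves this to expected O(n). The sketch below is an illustrative quickselect with a random pivot; the function names are hypothetical, and for an even-length array it assumes the median is the mean of the two middle values.

```python
import random


def quickselect(values, k):
    """Return the k-th smallest element (0-based) in expected O(n) time."""
    items = list(values)
    while True:
        pivot = random.choice(items)
        lower = [x for x in items if x < pivot]
        equal = [x for x in items if x == pivot]
        if k < len(lower):
            items = lower  # answer lies among the smaller elements
        elif k < len(lower) + len(equal):
            return pivot  # answer is the pivot itself
        else:
            # Answer lies among the larger elements; shift k accordingly.
            k -= len(lower) + len(equal)
            items = [x for x in items if x > pivot]


def median(values):
    """Median of an unsorted sequence, using quickselect instead of a full sort."""
    n = len(values)
    if n == 0:
        raise ValueError("median of an empty sequence is undefined")
    mid = n // 2
    if n % 2 == 1:
        return quickselect(values, mid)
    return (quickselect(values, mid - 1) + quickselect(values, mid)) / 2


if __name__ == "__main__":
    print(median([7, 1, 5, 3, 9]))
```

The worst case for a random pivot is still O(n²), although it is unlikely; the median-of-medians pivot rule guarantees O(n) at the cost of a larger constant factor.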