Linux Questions
Senior Data Engineer Interview Questions
2,590 senior data engineer interview questions shared by candidates
General Spark questions. Nothing complicated.
The 3rd round revolves around tic-tac-toe problem-solving skills.
How to implement CDC in ADF with native functionality?
Problem:
o A traveler flies to many cities (airports) in an unbroken chain of flights with no loops, i.e., never revisiting an airport.
o For every flight, she has a boarding pass with only a From (City) and To (City) printed on it but no date/time.
o At the end of her journey, she hands you all her boarding passes but they’re shuffled, so you don’t know the starting or the ending city.
Can you:
o Write logic or pseudocode to print her whole journey in sequence. It should print e.g. (Starting) City1 -> City2 -> ... -> (Ending) CityX
o State the time complexity of your solution.

o You’re given a Set of BoardingPass objects as input.
o There could be as many as hundreds of thousands of unique cities/airports.
o Memory is no concern (i.e., you have infinite memory!). Optimize for execution time (time complexity).
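One possible approach to the boarding-pass problem, sketched in Python. This is not an official answer from the interview. It assumes each boarding pass can be represented as a hypothetical `(from_city, to_city)` tuple in place of the `BoardingPass` objects mentioned in the question. The idea is to build a hash map from each departure city to its destination, find the one departure city that never appears as a destination (the start), and then follow the chain.

```python
from typing import Iterable, List, Tuple


def reconstruct_journey(passes: Iterable[Tuple[str, str]]) -> List[str]:
    """Return the cities in travel order, given shuffled (from, to) pairs.

    Assumes the passes form one unbroken chain with no revisited airport,
    as stated in the problem.
    """
    # Map each departure city to its destination: O(n) to build.
    next_city = dict(passes)
    if not next_city:
        return []

    # The starting city is the only departure city that is never a destination.
    destinations = set(next_city.values())
    start = next(city for city in next_city if city not in destinations)

    # Follow the chain until we reach a city with no outgoing flight.
    journey = [start]
    while journey[-1] in next_city:
        journey.append(next_city[journey[-1]])
    return journey


if __name__ == "__main__":
    shuffled = {("C", "D"), ("A", "B"), ("B", "C")}
    print(" -> ".join(reconstruct_journey(shuffled)))
```

With hash-based lookups, building the map, finding the start, and walking the chain each take O(n) expected time for n boarding passes, so the whole sketch runs in O(n) time and O(n) extra memory, which the problem explicitly allows.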
What is a conformed dimension? How many executors in Spark?
Sorting without using Python's built-in functions.
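The question does not say which algorithm the interviewer expected. As one illustration, here is a minimal merge sort sketch that avoids `sorted()` and `list.sort()`; the function name `merge_sort` is chosen here for the example.

```python
def merge_sort(items):
    """Return a new list with the elements of `items` in ascending order."""
    # A list of zero or one element is already sorted.
    if len(items) <= 1:
        return list(items)

    # Split in half and sort each half recursively.
    mid = len(items) // 2
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])

    # Merge the two sorted halves by repeatedly taking the smaller head.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1

    # One half is exhausted; append whatever remains of the other.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged
```

Merge sort runs in O(n log n) time and, because it takes from the left half on ties, it keeps equal elements in their original order.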
Questions involved: current roles and responsibilities; past projects; tech stack; sister tools and technologies; marketing domain knowledge; analytical approach to solving a real-time problem; SQL; Big Data concepts; data pipelines, key things (data validation, quality checks); BI and Data Science concepts and data consumption.
1. Previous work
2. Joining of datasets and questions in relation to CDC logic
3. SCD-related questions and backfill scenarios
4. Architecture of Ab Initio (since my tool was Ab Initio)
Traditional SQL vs Hive