What do you know about this company?
Lead Data Engineer Interview Questions
239 lead data engineer interview questions shared by candidates
Please prepare an architectural diagram of a data platform you have been involved in and you feel is relevant to Outra. In the interview you will be asked to talk through the various components, the technical choices, potential issues and any recommendations for the future.
1) Project explanation 2) Spark optimization 3) Data quality process and ways to handle it 4) Questions on Databricks, Data Factory, and a few Azure services mentioned in the CV 5) The whole interview was a discussion rather than a one-way process, which made me very comfortable
Write SQL queries and PySpark code to load, filter, aggregate, and save the result as a new table; strategies for designing ETL validations and merges when consuming Kafka and S3 bucket files. Strategies for consuming data from an S3 bucket, focusing on Spark architecture, cluster design, shuffling, narrow/wide transformations, authenticating to Azure data lakes, and the resource manager in Spark.
Is it possible to assign a specific broker to a Kafka producer?
Most questions were about Python and Hadoop
SQL problem statement, data modeling, data architecture
Build a SQL database from a JSON file they provided: you needed to ingest, normalise, and transform the data, then provide 5 SQL reports (top 5 items, top sales, etc.). The instructions were to use any method or language you preferred as long as you got the output, and they mentioned the exercise should take about 3 hours.
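One possible shape for the ingest, normalise, and report steps of such an exercise is sketched below using only the Python standard library. The JSON layout, the table names (orders, order_items), the column names, and the helper functions build_db and top_items are all assumptions made for illustration; the actual file and the required reports were not shared.

```python
import json
import sqlite3

# Hypothetical sample standing in for the provided JSON file (assumed layout).
RAW_JSON = """
[
  {"order_id": 1, "customer": "Ann",
   "items": [{"sku": "A", "qty": 2, "price": 5.0},
             {"sku": "B", "qty": 1, "price": 12.5}]},
  {"order_id": 2, "customer": "Bob",
   "items": [{"sku": "A", "qty": 1, "price": 5.0}]},
  {"order_id": 3, "customer": "Ann",
   "items": [{"sku": "C", "qty": 4, "price": 2.0}]}
]
"""


def build_db(raw_json):
    """Ingest nested JSON and normalise it into two related tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE orders (
            order_id INTEGER PRIMARY KEY,
            customer TEXT NOT NULL
        );
        CREATE TABLE order_items (
            order_id INTEGER NOT NULL REFERENCES orders(order_id),
            sku      TEXT    NOT NULL,
            qty      INTEGER NOT NULL,
            price    REAL    NOT NULL
        );
        """
    )
    for order in json.loads(raw_json):
        # One row per order, one row per nested line item.
        conn.execute(
            "INSERT INTO orders VALUES (?, ?)",
            (order["order_id"], order["customer"]),
        )
        conn.executemany(
            "INSERT INTO order_items VALUES (?, ?, ?, ?)",
            [(order["order_id"], i["sku"], i["qty"], i["price"])
             for i in order["items"]],
        )
    conn.commit()
    return conn


def top_items(conn, limit=5):
    """Report the best-selling items by units sold, with their revenue."""
    return conn.execute(
        """
        SELECT sku, SUM(qty) AS units, SUM(qty * price) AS revenue
        FROM order_items
        GROUP BY sku
        ORDER BY units DESC, sku
        LIMIT ?
        """,
        (limit,),
    ).fetchall()


conn = build_db(RAW_JSON)
report = top_items(conn)
for sku, units, revenue in report:
    print(sku, units, revenue)
```

Splitting the nested items into their own table keeps each report a plain GROUP BY query; the other reports (top sales and so on) would follow the same pattern with a different aggregate or grouping column.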
Python question: segregate the lowercase and uppercase records in a list
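A minimal sketch of one way to answer this, assuming the records are strings and that anything neither fully lowercase nor fully uppercase (mixed case, digits only, empty) should go into a third bucket. The function name split_by_case and the sample list are hypothetical; the question as posted did not specify how such records should be handled.

```python
def split_by_case(records):
    """Return (lowercase, uppercase, other) lists, preserving input order."""
    lower, upper, other = [], [], []
    for record in records:
        if record.islower():
            lower.append(record)
        elif record.isupper():
            upper.append(record)
        else:
            # Mixed case, or no cased characters at all (e.g. "42", "").
            other.append(record)
    return lower, upper, other


lower, upper, other = split_by_case(["spark", "SQL", "Kafka", "etl", "S3", "42"])
print(lower)  # ['spark', 'etl']
print(upper)  # ['SQL', 'S3']
print(other)  # ['Kafka', '42']
```

Note that str.islower and str.isupper ignore non-letter characters, so "S3" counts as uppercase, while a string with no letters at all satisfies neither test.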
What were my expectations? What do I know about Quantexa? Why Quantexa? Have I worked with financial crimes solutions before?