1. Describe what would happen if we changed the loss function for a classification problem from cross-entropy to MSE.
2. Explain what would happen if Wq = Wk = Wv in a transformer.
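The review does not record the expected answers. As a rough sketch of one common line of reasoning for each question, the Python below assumes a binary classifier with a sigmoid output for the first, and a single attention head with one shared projection matrix for the second. The names `ce_grad`, `mse_grad`, `X` and `W`, and all numeric values, are hypothetical and chosen only for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def ce_grad(z, y):
    """Gradient of binary cross-entropy w.r.t. the logit z: p - y."""
    return sigmoid(z) - y

def mse_grad(z, y):
    """Gradient of (sigmoid(z) - y)**2 w.r.t. the logit z: 2 * (p - y) * p * (1 - p)."""
    p = sigmoid(z)
    return 2.0 * (p - y) * p * (1.0 - p)

# Question 1: a confidently wrong prediction (true label 1, strongly negative logit).
z, y = -10.0, 1.0
print("CE gradient: ", ce_grad(z, y))   # magnitude near 1, so the model still learns
print("MSE gradient:", mse_grad(z, y))  # magnitude near 0, because the sigmoid saturates

def matmul(a, b):
    return [[sum(x * w for x, w in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

# Question 2: with one shared projection, Q and K are the same matrix,
# so the pre-softmax score matrix Q @ Q^T is symmetric.
X = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]]  # 3 tokens, hypothetical embeddings
W = [[0.2, -0.4], [1.0, 0.3]]              # hypothetical shared Wq = Wk = Wv
Q = matmul(X, W)
scores = matmul(Q, transpose(Q))
is_symmetric = all(
    scores[i][j] == scores[j][i] for i in range(len(scores)) for j in range(len(scores))
)
print("Scores symmetric:", is_symmetric)
```

Under these assumptions, MSE gives almost no gradient exactly when the model is most wrong, and tying the three projections removes the head's ability to score token i against token j differently from token j against token i.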
Interview questions [1]
Question 1
Behavioural questions about how you handle day-to-day responsibilities
I applied through a recruiter. I interviewed at Amazon (Bengaluru) in Nov 2024
Interview
The process consisted of 5 rounds after the recruiter screen, which is typical for Amazon. Each round looks for a specific kind of alignment:
2 technical rounds
1 system design round
1 behavioural round
1 LeetCode-style coding round
Interview questions [1]
Question 1
The technical rounds covered ML and previous project experience. They involved the basics of transformers and an understanding of what each component does.
‘If you had a task where the order of words/tokens didn’t matter, how would you train for it?’
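The review does not say what answer was expected. One possible direction is to drop positional encodings, since self-attention without them treats its input as a set, and then pool the outputs with an order-independent operation such as the mean. The sketch below illustrates that property in plain Python, assuming a single head with identity projections; `self_attention`, `mean_pool` and the token vectors are hypothetical stand-ins for illustration only.

```python
import math

def softmax(row):
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Single-head attention with identity projections and no positional encoding."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[i] for w, v in zip(weights, tokens)) for i in range(d)])
    return out

def mean_pool(vectors):
    """Order-independent pooling: average over the token axis."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

tokens = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]   # hypothetical token embeddings
shuffled = [tokens[2], tokens[0], tokens[1]]    # same tokens, different order

pooled = mean_pool(self_attention(tokens))
pooled_shuffled = mean_pool(self_attention(shuffled))

# The pooled representation matches up to floating-point rounding.
same = all(math.isclose(a, b, rel_tol=1e-9) for a, b in zip(pooled, pooled_shuffled))
print("Order-invariant:", same)
```

A simpler baseline with the same property is a bag-of-words model, which discards order before any learning happens.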
5 interviews - 2 coding, 1 ML breadth, 1 ML depth, 1 ML application.
A lot of behavioral questions (more than half of each interview).
The ML application interview was fair and focused on search results. The interviewers were nice. The process itself was too slow IMO,