Interview Experience for Data Engineer (Big Data) - 2021:
First Round:
What is Hive?
What is Hive metastore?
Hive external table vs managed table.
One question to evaluate programming skills.
Mappers and reducers in the Spark execution flow.
What are the groupByKey and reduceByKey operations in Spark?
Write Spark skeleton code for a given scenario: read a CSV and apply some transformations to it. The code doesn't have to be exact, but the steps matter.
Given a requirement (find the second-largest value of one column for every value of another column, i.e. window functions in SQL), what are the steps to achieve it using only Spark SQL, without the DataFrame/RDD APIs?
Questions on SQL window functions in Spark (a SQL query is preferred over the DataFrame API).
Different storage formats in the big data space, and questions on them (Parquet, CSV, JSON, Avro, Delta, etc.).
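
For the "second-largest value per group" question above, one possible answer is a DENSE_RANK window partitioned by the grouping column. The sketch below is not Spark code: it runs the query against Python's built-in sqlite3 module as a stand-in, so it can be tried without a cluster (this assumes the bundled SQLite is 3.25 or newer, which is when window functions were added). The table name `employees`, its columns, and the sample rows are all hypothetical. In Spark, the usual route would be to register the data as a temporary view (for example with `createOrReplaceTempView`) and pass the same query string to `spark.sql`.

```python
import sqlite3

# Window-function query. DENSE_RANK gives tied salaries the same rank, so
# rank 2 is the second-largest *distinct* salary in each department.
# The table and column names are hypothetical, chosen for illustration.
SECOND_LARGEST_SQL = """
SELECT dept, salary AS second_largest_salary
FROM (
    SELECT DISTINCT dept, salary,
           DENSE_RANK() OVER (PARTITION BY dept ORDER BY salary DESC) AS rnk
    FROM employees
) ranked
WHERE rnk = 2
ORDER BY dept
"""

# Made-up sample data: (dept, name, salary).
SAMPLE_ROWS = [
    ("eng", "a", 100),
    ("eng", "b", 120),
    ("eng", "c", 120),  # tie for the largest salary in "eng"
    ("eng", "d", 90),
    ("ops", "e", 70),
    ("ops", "f", 80),
    ("hr", "g", 50),    # only one salary, so "hr" has no second-largest
]


def second_largest_per_group(rows):
    """Load rows into an in-memory SQLite table and run the window query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE employees (dept TEXT, name TEXT, salary INTEGER)"
        )
        conn.executemany("INSERT INTO employees VALUES (?, ?, ?)", rows)
        return conn.execute(SECOND_LARGEST_SQL).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for dept, salary in second_largest_per_group(SAMPLE_ROWS):
        print(dept, salary)
```

DENSE_RANK is used rather than ROW_NUMBER so that ties for the largest value do not get reported as the "second largest"; groups with fewer than two distinct values simply produce no row. Whether ties should be collapsed this way is worth clarifying with the interviewer.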