
data engineer interview questions

MANISH KUMAR

In this video, I talk about salting in Spark.
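For readers who want a quick picture of the idea before watching, here is a minimal plain-Python sketch of a salted join. It is not the code from the video and it does not use Spark itself: the function names, the sample data, and `NUM_SALTS` are all hypothetical, chosen only to illustrate the technique. The large, skewed side gets a random salt appended to its join key, the small side is replicated once per salt value, and the join then runs on the (key, salt) pair so that one hot key is spread over several groups.

```python
import random
from collections import Counter, defaultdict

NUM_SALTS = 4  # assumed number of salt buckets; tune to the degree of skew


def salt_large_side(rows, num_salts, rng):
    """Append a random salt to each join key on the skewed (large) side."""
    return [((key, rng.randrange(num_salts)), value) for key, value in rows]


def explode_small_side(rows, num_salts):
    """Replicate each row on the small side once per salt value."""
    return [((key, salt), value) for key, value in rows for salt in range(num_salts)]


def join_on_salted_key(large, small):
    """Hash join on the (key, salt) pair, then drop the salt from the output."""
    lookup = defaultdict(list)
    for salted_key, value in small:
        lookup[salted_key].append(value)
    return [
        (salted_key[0], left_value, right_value)
        for salted_key, left_value in large
        for right_value in lookup[salted_key]
    ]


# Hypothetical data: store "S1" dominates the sales rows (a skewed key).
sales = [("S1", amount) for amount in range(1000)] + [("S2", 5), ("S3", 7)]
stores = [("S1", "Delhi"), ("S2", "Pune"), ("S3", "Patna")]

rng = random.Random(42)
salted_sales = salt_large_side(sales, NUM_SALTS, rng)
exploded_stores = explode_small_side(stores, NUM_SALTS)
joined = join_on_salted_key(salted_sales, exploded_stores)

# Rows per salted key: the hot key "S1" is now split across several groups.
rows_per_salted_key = Counter(salted_key for salted_key, _ in salted_sales)
```

The join result is the same as an unsalted join; only the grouping of work changes. The trade-off is that the small side grows by a factor of `NUM_SALTS`, so the salt count should stay modest.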

Directly connect with me on: https://topmate.io/manish_kumar25

Discord channel:   / discord  

Project details for resume:

Successfully led a data engineering project in a retail environment using technologies such as Apache Spark, Python, SQL, and Amazon S3 to optimize data processing.

Implemented structured data models, including dimension and fact tables, to provide valuable context for point-of-sale data analysis.

Designed and executed an incentive program based on sales performance, enhancing motivation among sales teams by rewarding top performers.

Managed extensive daily data volumes of approximately 100GB, demonstrating the ability to handle large-scale data pipelines.

Employed Spark optimization techniques like caching and broadcast joins to improve data processing speed and efficiency.

Utilized Azure CI/CD pipelines for code deployment, and orchestrated workflows using Airflow and cron jobs.


Detailed write-up to elaborate on during the interview:

As a Data Engineer on a project for a prominent offline grocery and kitchen supplies retailer, I applied my expertise in data engineering to drive critical improvements in their data processing and analysis operations.

The project primarily focused on processing and analyzing point-of-sale data, which was structured into dimension and fact tables to provide meaningful context for sales analysis. To further enhance employee motivation and performance, we designed and implemented an incentive program that rewarded the salespeople with the highest sales volumes in each store.

Handling a substantial daily data volume of approximately 100GB, we leveraged Apache Spark and applied optimization techniques like data caching and broadcast joins to significantly accelerate data processing. This not only improved the speed of our data pipelines but also increased the efficiency of our data analysis.
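To make the broadcast join idea concrete during an interview, a minimal plain-Python sketch can help. This is an illustration of the concept only, not the project's code: the function name and the sample tables are hypothetical. The small dimension table is turned into an in-memory lookup that every worker could hold, so the large fact table is joined row by row without being shuffled. In PySpark, the corresponding hint is `pyspark.sql.functions.broadcast` applied to the small DataFrame.

```python
def broadcast_hash_join(large_rows, small_rows):
    """Inner join that builds an in-memory lookup from the small table.

    Assumes join keys are unique on the small side (typical for a dimension table).
    """
    lookup = dict(small_rows)  # the "broadcast" copy of the small table
    return [
        (key, value, lookup[key])
        for key, value in large_rows
        if key in lookup  # rows without a match are dropped (inner join)
    ]


# Hypothetical fact and dimension rows.
fact_sales = [("S1", 120.0), ("S2", 75.5), ("S1", 40.0), ("S9", 10.0)]
dim_store = [("S1", "Delhi"), ("S2", "Pune")]

result = broadcast_hash_join(fact_sales, dim_store)
```

This only pays off when the small table comfortably fits in memory on each worker; otherwise a regular shuffle join is the safer choice.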

We integrated code deployment into the Azure CI/CD pipeline. As part of workflow automation, we scheduled and orchestrated tasks using Airflow and cron jobs.

One of the project's major achievements was the implementation of a customer engagement strategy that identified infrequent buyers and offered them incentives in the form of coupons. This initiative not only boosted customer retention but also had a positive impact on overall business growth.



For further queries, reach out to me on the social media handles below.

Follow me on LinkedIn:   / manishkumar373b86176  
Follow Me On Instagram:   / competitive_gyan1  
Follow me on Facebook:   / manish12340  

My Second Channel    / @competitivegyan1  

Interview series Playlist:    • Interview Questions and answers  


My Gear:
Rode Mic: https://amzn.to/3RekC7a
Boya M1 Mic: https://amzn.to/3uW0nnn
Wireless Mic: https://amzn.to/3TqLRhE
Tripod1: https://amzn.to/4avjyF4
Tripod2: https://amzn.to/46Y3QPu
camera1: https://amzn.to/3GIQlsE
camera2: https://amzn.to/46X190P
Pentab (Medium size): https://amzn.to/3RgMszQ (Recommended)
Pentab (Small size): https://amzn.to/3RpmIS0
Mobile: https://amzn.to/47Y8oa4 (You should definitely not buy this one)
Laptop: https://amzn.to/3Ns5Okj
Mouse+keyboard combo: https://amzn.to/3Ro6GYl
21 inch Monitor: https://amzn.to/3TvCE7E
27 inch Monitor: https://amzn.to/47QzXlA
iPad Pencil: https://amzn.to/4aiJxiG
iPad 9th Generation: https://amzn.to/470I11X
Boom Arm/Swing Arm: https://amzn.to/48eH2we

My PC Components:
Intel i7 Processor: https://amzn.to/47Svdfe
G.Skill RAM: https://amzn.to/47VFffI
Samsung SSD: https://amzn.to/3uVSE8W
WD blue HDD: https://amzn.to/47Y91QY
RTX 3060Ti Graphic card: https://amzn.to/3tdLDjn
Gigabyte Motherboard: https://amzn.to/3RFUTGl
O11 Dynamic Cabinet: https://amzn.to/4avkgSK
Liquid cooler: https://amzn.to/472S8mS
Antec Prizm FAN: https://amzn.to/48ey4Pj
