As a Senior Data Engineer with over 9 years of experience, I specialize in architecting and implementing scalable, cloud-native data solutions across AWS, GCP, and Azure. I’ve led complex migrations, built high-performance ETL/ELT pipelines, and optimized big data processing using tools like PySpark, Kafka, and Dataflow. My expertise spans serverless architectures, data lakehouse design, and real-time analytics, with a strong focus on data governance, automation, and cross-functional collaboration. Passionate about delivering reliable, cost-effective solutions, I bridge the gap between technical execution and business impact, helping organizations turn data into a strategic asset.
Professional Experience
CVS Health – Sr. Data Engineer (Jul 2023 – Present)
- Migrated Oracle systems to BigQuery and developed scalable ETL pipelines with Apache Airflow.
- Replaced legacy Hadoop infrastructure with GCP Dataproc, reducing costs by 40%.
- Built real-time and batch data pipelines using Dataflow, Pub/Sub, and BigQuery.
- Designed PySpark frameworks for large-scale cleansing, merging, and transformation.
- Implemented CI/CD pipelines via Jenkins and Cloud Build to streamline deployments.
- Architected a hybrid cloud setup using GCP and AWS for high availability and disaster recovery.
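For illustration, the cleansing-and-merge pattern behind the PySpark frameworks above can be sketched in plain Python. Field names ("id", "email") are hypothetical; in production the same pattern ran as PySpark DataFrame transformations over large datasets.

```python
# Minimal pure-Python sketch of a cleanse-and-merge step.
# Field names are illustrative; the production framework applied
# the same logic via PySpark DataFrame transformations.

def cleanse(record):
    """Normalize whitespace and casing on string fields."""
    return {k: v.strip().lower() if isinstance(v, str) else v
            for k, v in record.items()}

def merge_by_key(records, key="id"):
    """Merge duplicate records on a key, later values winning on conflict."""
    merged = {}
    for rec in map(cleanse, records):
        merged.setdefault(rec[key], {}).update(rec)
    return list(merged.values())

rows = [
    {"id": 1, "email": " Alice@Example.COM "},
    {"id": 1, "phone": "555-0100"},
    {"id": 2, "email": "bob@example.com"},
]
print(merge_by_key(rows))
```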
UBS – Data Engineer (Feb 2021 – Dec 2021)
- Developed Azure-based modern data platforms supporting real-time reporting.
- Built ADF pipelines for ingestion and transformation across Azure SQL and Blob Storage.
- Designed Spark jobs using PySpark for complex transformations and aggregations.
- Tuned Spark workloads for optimal memory and performance.
- Created advanced UDFs in PySpark and Scala to handle custom business logic.
- Automated pipeline deployment using JSON templates and improved CI/CD reliability.
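As an illustration of the custom-business-logic UDFs above, the sketch below shows the kind of plain-Python function that gets registered as a PySpark UDF. The tiering rule and thresholds are hypothetical, not actual UBS logic.

```python
# Illustrative business-logic function of the kind registered as a
# PySpark UDF (e.g. via pyspark.sql.functions.udf).
# Tier names and thresholds are hypothetical examples.

def account_tier(balance):
    """Map an account balance to a reporting tier."""
    if balance is None:          # UDFs must tolerate nulls
        return "unknown"
    if balance >= 1_000_000:
        return "platinum"
    if balance >= 100_000:
        return "gold"
    return "standard"

# In Spark this would be applied column-wise, e.g.:
#   tier_udf = udf(account_tier, StringType())
#   df.withColumn("tier", tier_udf(df["balance"]))
print([account_tier(b) for b in (2_500_000, 150_000, 99.0, None)])
```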
Vodafone Group – Big Data Engineer (May 2018 – Jan 2021)
- Built ETL pipelines using ADF, Azure Synapse, and Databricks to unify enterprise data.
- Automated job monitoring via Python SDK and real-time SQL dashboards.
- Flattened complex JSON with Python UDFs and optimized Spark transformations.
- Integrated Synapse with Azure services to enhance data governance and scalability.
- Used self-hosted runtimes to bridge Hadoop with Azure Data Factory for secure migration.
- Engineered real-time ingestion pipelines using Kafka, Spark Streaming, and Apache NiFi.
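For illustration, the JSON-flattening UDF pattern above can be sketched in pure Python: nested objects become dot-separated columns ready for tabular storage. The sample event is hypothetical.

```python
# Pure-Python sketch of the JSON-flattening UDF pattern: nested
# dicts collapse into a single level with dot-separated keys.

def flatten_json(obj, prefix=""):
    """Recursively flatten nested dicts into a single-level dict."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten_json(value, name))
        else:
            flat[name] = value
    return flat

event = {"user": {"id": 7, "geo": {"country": "UK"}}, "bytes": 1024}
print(flatten_json(event))
# → {'user.id': 7, 'user.geo.country': 'UK', 'bytes': 1024}
```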
Sterlite Technologies – Python Developer (Sep 2015 – Mar 2018)
- Developed a multi-threaded standalone Python application to view circuit parameters and performance.
- Collaborated with a team of developers on Python applications for risk management.
- Developed frontend and backend modules using Python on the Django web framework.
- Built a PyQt GUI enabling end users to create, modify, and view reports based on client data.
- Designed and developed user interfaces using HTML5, CSS3, JavaScript, Bootstrap, and JSON.
- Wrote and executed MySQL queries from Python using the MySQL Connector/Python and MySQLdb packages.
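For illustration, the multi-threaded pattern behind the circuit-parameter viewer above can be sketched with a thread pool fanning out concurrent reads. `read_circuit` is a hypothetical stand-in for the real I/O-bound device or database call.

```python
# Sketch of the multi-threaded polling pattern used in the
# circuit-parameter viewer: worker threads read circuits concurrently.
from concurrent.futures import ThreadPoolExecutor

def read_circuit(circuit_id):
    """Hypothetical stand-in for an I/O-bound parameter read."""
    return {"circuit": circuit_id, "latency_ms": 10 * circuit_id}

def poll_all(circuit_ids, workers=4):
    """Fan out reads across a thread pool; results keep input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(read_circuit, circuit_ids))

print(poll_all([1, 2, 3]))
```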
Education
Master of Science in Computer Technology, Eastern Illinois University, 2023.