As a Senior Data Engineer with over 9 years of experience, I specialize in architecting and implementing scalable, cloud-native data solutions across AWS, GCP, and Azure. I’ve led complex migrations, built high-performance ETL/ELT pipelines, and optimized big data processing using tools like PySpark, Kafka, and Dataflow. My expertise spans serverless architectures, data lakehouse design, and real-time analytics, with a strong focus on data governance, automation, and cross-functional collaboration. Passionate about delivering reliable, cost-effective solutions, I bridge the gap between technical execution and business impact, helping organizations turn data into a strategic asset.
Professional Experience
CVS Health – Sr. Data Engineer (Jul 2023 – Present)
- Migrated Oracle systems to BigQuery and developed scalable ETL pipelines with Apache Airflow.
- Replaced legacy Hadoop infrastructure with GCP Dataproc, reducing costs by 40%.
- Built real-time and batch data pipelines using Dataflow, Pub/Sub, and BigQuery.
- Designed PySpark frameworks for large-scale cleansing, merging, and transformation.
- Implemented CI/CD pipelines via Jenkins and Cloud Build to streamline deployments.
- Architected a hybrid cloud setup using GCP and AWS for high availability and disaster recovery.
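For illustration, the cleansing-and-merge pattern behind the PySpark frameworks above can be sketched in plain Python. Field names ("id", "email") are hypothetical; in production the same pattern ran as PySpark DataFrame transformations over large datasets.

```python
# Minimal pure-Python sketch of a cleanse-and-merge step.
# Field names are illustrative; the production framework applied
# the same logic via PySpark DataFrame transformations.

def cleanse(record):
    """Normalize whitespace and casing on string fields."""
    return {k: v.strip().lower() if isinstance(v, str) else v
            for k, v in record.items()}

def merge_by_key(records, key="id"):
    """Merge duplicate records on a key, later values winning on conflict."""
    merged = {}
    for rec in map(cleanse, records):
        merged.setdefault(rec[key], {}).update(rec)
    return list(merged.values())

rows = [
    {"id": 1, "email": " Alice@Example.COM "},
    {"id": 1, "phone": "555-0100"},
    {"id": 2, "email": "bob@example.com"},
]
print(merge_by_key(rows))
```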
UBS – Data Engineer (Feb 2021 – Dec 2021)
- Developed Azure-based modern data platforms supporting real-time reporting.
- Built ADF pipelines for ingestion and transformation across Azure SQL and Blob Storage.
- Designed Spark jobs using PySpark for complex transformations and aggregations.
- Tuned Spark workloads for optimal memory and performance.
- Created advanced UDFs in PySpark and Scala to handle custom business logic.
- Automated pipeline deployment using JSON templates and improved CI/CD reliability.
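As an illustration of the custom-business-logic UDFs above, the sketch below shows the kind of plain-Python function that gets registered as a PySpark UDF. The tiering rule and thresholds are hypothetical, not actual UBS logic.

```python
# Illustrative business-logic function of the kind registered as a
# PySpark UDF (e.g. via pyspark.sql.functions.udf).
# Tier names and thresholds are hypothetical examples.

def account_tier(balance):
    """Map an account balance to a reporting tier."""
    if balance is None:          # UDFs must tolerate nulls
        return "unknown"
    if balance >= 1_000_000:
        return "platinum"
    if balance >= 100_000:
        return "gold"
    return "standard"

# In Spark this would be applied column-wise, e.g.:
#   tier_udf = udf(account_tier, StringType())
#   df.withColumn("tier", tier_udf(df["balance"]))
print([account_tier(b) for b in (2_500_000, 150_000, 99.0, None)])
```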
Vodafone Group – Big Data Engineer (May 2018 – Jan 2021)
- Built ETL pipelines using ADF, Azure Synapse, and Databricks to unify enterprise data.
- Automated job monitoring via Python SDK and real-time SQL dashboards.
- Flattened complex JSON with Python UDFs and optimized Spark transformations.
- Integrated Synapse with Azure services to enhance data governance and scalability.
- Used self-hosted runtimes to bridge Hadoop with Azure Data Factory for secure migration.
- Engineered real-time ingestion pipelines using Kafka, Spark Streaming, and Apache NiFi.
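For illustration, the JSON-flattening UDF pattern above can be sketched in pure Python: nested objects become dot-separated columns ready for tabular storage. The sample event is hypothetical.

```python
# Pure-Python sketch of the JSON-flattening UDF pattern: nested
# dicts collapse into a single level with dot-separated keys.

def flatten_json(obj, prefix=""):
    """Recursively flatten nested dicts into a single-level dict."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten_json(value, name))
        else:
            flat[name] = value
    return flat

event = {"user": {"id": 7, "geo": {"country": "UK"}}, "bytes": 1024}
print(flatten_json(event))
# → {'user.id': 7, 'user.geo.country': 'UK', 'bytes': 1024}
```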
Sterlite Technologies – Python Developer (Sep 2015 – Mar 2018)
- Developed a multi-threaded standalone Python application to view circuit parameters and performance.
- Collaborated with a team of developers on Python applications for risk management.
- Developed frontend and backend modules using Python on the Django web framework.
- Built a PyQt GUI enabling end users to create, modify, and view reports based on client data.
- Designed and developed user interfaces using HTML5, CSS3, JavaScript, Bootstrap, and JSON.
- Wrote and executed MySQL queries from Python using the MySQL Connector/Python and MySQLdb packages.
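For illustration, the multi-threaded pattern behind the circuit-parameter viewer above can be sketched with a thread pool fanning out concurrent reads. `read_circuit` is a hypothetical stand-in for the real I/O-bound device or database call.

```python
# Sketch of the multi-threaded polling pattern used in the
# circuit-parameter viewer: worker threads read circuits concurrently.
from concurrent.futures import ThreadPoolExecutor

def read_circuit(circuit_id):
    """Hypothetical stand-in for an I/O-bound parameter read."""
    return {"circuit": circuit_id, "latency_ms": 10 * circuit_id}

def poll_all(circuit_ids, workers=4):
    """Fan out reads across a thread pool; results keep input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(read_circuit, circuit_ids))

print(poll_all([1, 2, 3]))
```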
Education
Master of Science in Computer Technology, Eastern Illinois University, 2023.