
Satyanarayana Annepogu

Database Developer

Location
Toronto, ON, Canada
Toptal Member Since
October 25, 2022

Satya is a senior data engineer with over 15 years of IT experience designing and developing data warehouses for banking and insurance clients. He specializes in designing and building modern data pipelines and streams on the AWS and Azure data engineering stacks and is an expert in modernizing enterprise data solutions with AWS and Azure cloud data technologies.

Portfolio

Heimstaden Services AB
Azure Data Factory, Data Engineering, Data Pipelines, SQL...
IBM
Amazon CloudWatch, Amazon RDS, Amazon S3 (AWS S3), Amazon EC2...
IBM
Autosys, Azure Data Factory, Azure Databricks, Azure SQL, Azure SQL Databases...

Experience

ETL Tools - 14 years
Python - 4 years
Azure Databricks - 4 years
Azure Synapse - 4 years
Apache Airflow - 4 years
Redshift - 4 years
AWS Glue - 4 years
Amazon Web Services (AWS) - 4 years

Location

Toronto, ON, Canada

Availability

Part-time

Preferred Environment

Azure Data Factory, Azure Databricks, Informatica ETL, Amazon Web Services (AWS), Apache Airflow, Redshift, AWS Glue, Python, PostgreSQL 10.1, Azure Synapse

The most amazing...

...project I've done is designing, developing, and supporting cloud-based and traditional data warehouse applications.

Work Experience

2022 - 2023

Data Analyst

Heimstaden Services AB
  • Acted as a senior data engineer with demonstrated analyst skills and worked on ETL architecture solutions.
  • Performed requirements assessments and designed suitable data flows or data batches.
  • Optimized solutions and built end-to-end data pipelines while preserving data integrity.
  • Designed and developed ETL processes in AWS Glue to migrate campaign and API data in various file formats (JSON, ORC, and Parquet) into Amazon Redshift (a representative Glue job is sketched after this role).
  • Designed and developed ETL processes to extract Salesforce data and load it into Amazon Redshift.
Technologies: Azure Data Factory, Data Engineering, Data Pipelines, SQL, Business Intelligence (BI), ETL Tools, Scripting Languages, APIs, Data Wrangling, Amazon S3 (AWS S3), AWS Lambda, Spark, AWS Glue, Amazon EC2, Amazon Elastic MapReduce (EMR), Amazon RDS, Redshift, SQL Stored Procedures, Amazon Aurora, Apache Airflow, Data Analysis, Data Analytics, Amazon CloudWatch, Amazon QuickSight, AWS Data Pipeline Service, PostgreSQL 10.1, Azure SQL Data Warehouse (SQL DW), PostgreSQL, Database Optimization, Database Architecture, XML, CI/CD Pipelines, GitHub, Excel 2016, Tableau, Data Build Tool (dbt), NoSQL, Webhooks, BI Reporting, Database Migration, CDC, Data-driven Dashboards, DAX, Microsoft Power BI, Business Services, Apache Spark, Database Design, Database Structure, Database Transactions, Transactional Data, MySQL, Microsoft Excel, Real Estate, Geospatial Data, OLTP, OLAP, DevOps, Data
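
Below is a minimal sketch of the kind of AWS Glue (PySpark) job described in this role: it reads raw JSON extracts from S3 and loads them into Amazon Redshift through a Glue catalog connection. The bucket, connection, database, and table names are hypothetical placeholders, not the actual project resources.

import sys
from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read raw campaign/API extracts (JSON in this example) from S3.
campaigns = glue_context.create_dynamic_frame.from_options(
    connection_type="s3",
    format="json",
    connection_options={"paths": ["s3://example-raw-zone/campaigns/"]},
)

# Light cleanup before loading: drop records with a null campaign ID.
campaigns_clean = campaigns.filter(lambda rec: rec["campaign_id"] is not None)

# Load into Redshift through a pre-defined Glue catalog connection,
# staging intermediate files in S3 as Glue requires for bulk Redshift loads.
glue_context.write_dynamic_frame.from_jdbc_conf(
    frame=campaigns_clean,
    catalog_connection="example-redshift-conn",
    connection_options={"dbtable": "staging.campaigns", "database": "analytics"},
    redshift_tmp_dir="s3://example-temp-zone/glue/",
)

job.commit()
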
2020 - 2022

AWS Data Engineer

IBM
  • Designed and implemented data pipelines using AWS services such as S3, Glue, and Redshift.
  • Developed and maintained data processing and transformation scripts using Python and SQL. Optimized data storage and retrieval using AWS database services such as RDS and DynamoDB.
  • Built and maintained data warehouses and data lakes using AWS Redshift and Athena.
  • Implemented data security and access controls using AWS IAM and KMS. Monitored and troubleshot data pipelines and systems using AWS CloudWatch and other monitoring tools.
  • Collaborated with data scientists and analysts to provide data insights and support their data needs.
  • Automated data processing and deployment using AWS Lambda and other serverless technologies (see the Lambda sketch after this role).
  • Developed and maintained ETL workflows using AWS Step Functions and other workflow tools. Stayed up-to-date with the latest AWS data services and technologies and recommended new solutions to improve data engineering processes.
Technologies: Amazon CloudWatch, Amazon RDS, Amazon S3 (AWS S3), Amazon EC2, Amazon Web Services (AWS), AWS Glue, AWS IAM, Redshift, Amazon DynamoDB, Python, SQL, PostgreSQL 10.1, PostgreSQL, Database Optimization, Lambda Functions, Database Architecture, Elasticsearch, AWS Cloud Architecture, XML, CI/CD Pipelines, GitHub, Excel 2016, Tableau, NoSQL, Webhooks, BI Reporting, CDC, Business Services, Apache Spark, Database Design, Database Structure, Database Transactions, Transactional Data, Microsoft Excel, OLTP, OLAP, DevOps, Identity & Access Management (IAM), Data
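
As an illustration of the serverless automation mentioned in this role, here is a minimal sketch of an S3-triggered Lambda handler that starts a downstream Glue job with boto3. The Glue job name and argument key are hypothetical, and the printed output simply lands in CloudWatch Logs for monitoring.

import json
import boto3

glue = boto3.client("glue")

def lambda_handler(event, context):
    # Each S3 event record carries the bucket and key of the newly landed file.
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]

        # Start the downstream Glue job, passing the file location as a job argument.
        response = glue.start_job_run(
            JobName="example-ingest-job",
            Arguments={"--source_path": f"s3://{bucket}/{key}"},
        )

        # Printed output is captured in CloudWatch Logs for troubleshooting.
        print(json.dumps({"started_run": response["JobRunId"], "source": key}))

    return {"statusCode": 200}
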
2018 - 2020

Azure Data Engineer and Data Warehouse Consultant

IBM
  • Designed and developed data ingestion pipelines using ADF and a processing layer using Databricks notebooks with PySpark (a notebook sketch follows this role). Led the planning, development, testing, implementation, documentation, and support of data pipelines.
  • Implemented various aspects of the project, including pausing and resuming the Azure SQL data warehouse from ADF, reusable ADF pipelines for business-rule use cases, and ingestion of CSV, fixed-width, and Excel files.
  • Collaborated with a client and IBM ETL teams, analyzed on-premises Informatica-based ETL solutions, and designed ETL solutions using Azure Data Factory pipelines and Azure Databricks PySpark and Spark SQL.
  • Worked with technical and product stakeholders to understand data-oriented project requirements and helped implement the solution's Azure infrastructure components to create the first usable iteration of the CPD application.
  • Orchestrated and automated pipeline POCs with Apache Spark, using PySpark and Spark SQL for various complex data transformation requirements.
  • Used PowerShell scripts to automate pipelines and tuned pipeline performance in Azure Data Factory and Azure Databricks.
Technologies: Autosys, Azure Data Factory, Azure Databricks, Azure SQL, Azure SQL Databases, Azure Synapse, Data Engineering, SQL, Data Pipelines, JSON, ETL, T-SQL (Transact-SQL), Python, Pipelines, Data Management, Azure, Dimensional Modeling, Data Lakes, Data Architecture, Microsoft SQL Server, Migration, Query Composition, Performance Tuning, Data Warehouse Design, Data Warehousing, Databricks, Relational Databases, Databases, Analytics, Azure Data Explorer, Consulting, Python 3, CSV File Processing, XLSX File Processing, CSV, Postman, Business Intelligence (BI), ETL Tools, Data Migration, Scripting Languages, Orchestration, Machine Learning, APIs, Technical Project Management, Kanban, ETL Development, Data Wrangling, Amazon S3 (AWS S3), Big Data, AWS Lambda, Spark, AWS Glue, Data Transformation, Amazon EC2, Amazon Elastic MapReduce (EMR), Amazon RDS, Redshift, SQL Stored Procedures, Normalization, Scala, Shell Scripting, Architecture, Data Integration, Google Cloud Platform (GCP), Amazon Aurora, Apache Airflow, Data Analysis, Data Analytics, Pandas, Amazon Web Services (AWS), AWS IAM, Amazon CloudWatch, Amazon DynamoDB, PostgreSQL 10.1, Azure SQL Data Warehouse (SQL DW), PostgreSQL, Database Optimization, Database Architecture, XML, CI/CD Pipelines, GitHub, Excel 2016, Tableau, Data Build Tool (dbt), NoSQL, Database Migration, Data-driven Dashboards, DAX, Microsoft Power BI, Business Services, Apache Spark, Database Design, Database Structure, Database Transactions, Transactional Data, MySQL, Microsoft Excel, OLTP, OLAP, Data
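
A minimal sketch of the Databricks processing layer described in this role, assuming the notebook is called from an ADF pipeline that passes the source path as a notebook widget. The `spark` and `dbutils` objects are provided by the Databricks runtime, and all paths, columns, and table names are hypothetical.

from pyspark.sql import functions as F

# Parameter supplied by the calling ADF pipeline (Databricks widget API).
source_path = dbutils.widgets.get("source_path")

# Ingest a raw CSV extract from the data lake.
raw = (
    spark.read
    .option("header", "true")
    .option("inferSchema", "true")
    .csv(source_path)
)

# Basic standardization before the curated layer: trim keys and stamp the load time.
curated = (
    raw.withColumn("customer_id", F.trim(F.col("customer_id")))
       .withColumn("load_ts", F.current_timestamp())
)

# Expose the frame to Spark SQL for rule-based transformations.
curated.createOrReplaceTempView("stg_customers")
result = spark.sql("SELECT * FROM stg_customers WHERE customer_id IS NOT NULL")

# Persist to the curated zone as Delta for downstream consumption.
result.write.format("delta").mode("overwrite").save("/mnt/curated/customers")
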
2009 - 2018

Senior ETL Consultant and Team Lead

IBM
  • Developed solutions in a highly demanding environment and provided hands-on guidance to other team members. Headed complex ETL requirements and design and assessed requirements for completeness and accuracy.
  • Implemented Informatica-based ETL solution fulfilling stringent performance requirements. Collaborated with product development teams and senior designers to develop architectural requirements to ensure client satisfaction with the product.
  • Determined if requirements were actionable for the ETL team and conducted an impact assessment to determine the size of effort based on needs.
  • Developed entire Software Development Lifecycle (SDLC) project plans to implement ETL solutions and identify resource requirements.
  • Assisted and verified solutions design and production of all design phase deliverables. Managed the build phase and quality assurance code to fulfill requirements and adhere to ETL architecture. Resolved difficult design and development issues.
  • Provided the team with the vision of the project's objectives, ensured discussions and decisions led toward closure, and maintained healthy group dynamics.
  • Familiarized the team with customer needs, specifications, design targets, development process, design standards, techniques, and tools to support task performance.
  • Performed an active, leading role in shaping and enhancing overall ETL Informatica architecture. Identified, recommended, and implemented ETL process and architecture improvements.
Technologies: Informatica ETL, Netezza, Autosys, Unix Shell Scripting, IBM Db2, Data Engineering, SQL, Data Pipelines, JSON, ETL, Pipelines, Data Management, Informatica, Informatica Cloud, Data Modeling, Dimensional Modeling, PL/SQL, Data Architecture, Query Optimization, Query Composition, Performance Tuning, Data Warehousing, Relational Databases, Databases, Analytics, Consulting, XLSX File Processing, CSV, Business Intelligence (BI), ETL Tools, Scripting Languages, Orchestration, Technical Project Management, Kanban, ETL Development, Data Wrangling, SQL Stored Procedures, Normalization, Shell Scripting, Architecture, Data Analysis, Data Analytics, Excel Macros, Pandas, Amazon Web Services (AWS), AWS IAM, Amazon CloudWatch, Amazon QuickSight, AWS Data Pipeline Service, Database Optimization, Database Architecture, Oracle PL/SQL, PL/SQL Tuning, CI/CD Pipelines, Excel 2016, Database Administration (DBA), Database Structure, Database Transactions, Transactional Data, MySQL, Microsoft Excel, OLTP, OLAP, Data
2008 - 2009

Senior ETL Developer

Genesys
  • Developed mappings for type two slowly changing dimensions to update existing rows and insert new rows in targets (see the SCD Type 2 sketch after this role). Worked on Actuate for formatting reports related to different processes.
  • Created and developed Actuate reports such as drill-up, drill-down, series, and parallel reports. Analyzed the number of reports generated, failed, waiting, and scheduled.
  • Built dashboards for generated, failed, waiting, and scheduled reports concerning quarter-hour, hour, day, month, and year.
Technologies: Informatica ETL, Unix Shell Scripting, Control-M, Data Engineering, SQL, Data Pipelines, JSON, ETL, Pipelines, Data Management, Informatica, PL/SQL, Data Architecture, Query Optimization, Query Composition, Performance Tuning, Data Warehouse Design, Data Warehousing, Relational Databases, Databases, CSV, ETL Tools, Orchestration, Kanban, ETL Development, Data Wrangling, SQL Stored Procedures, Shell Scripting, Data Integration, Excel Macros, Database Optimization, Oracle PL/SQL, PL/SQL Tuning, Excel 2016, Database Transactions, Microsoft Excel, OLTP, OLAP, Data
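
The type two slowly changing dimension pattern referenced in this role, expressed here as a PySpark sketch purely for illustration (the original work was implemented as Informatica mappings); table and column names are hypothetical.

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("scd2-sketch").getOrCreate()

dim = spark.table("dw.customer_dim")         # existing dimension with a current-row flag
stage = spark.table("stg.customer_updates")  # incoming rows from the source system

# Rows whose tracked attribute changed relative to the current dimension version.
changed = (
    stage.alias("s")
    .join(
        dim.filter(F.col("is_current") == 1).alias("d"),
        F.col("s.customer_id") == F.col("d.customer_id"),
        "inner",
    )
    .filter(F.col("s.address") != F.col("d.address"))
    .select("s.*")
)

# Step 1: expire the current version of every changed customer.
expired = (
    dim.filter(F.col("is_current") == 1)
       .join(changed.select("customer_id"), "customer_id", "left_semi")
       .withColumn("is_current", F.lit(0))
       .withColumn("end_date", F.current_date())
)

# Step 2: append a new open-ended version for each changed customer.
new_versions = (
    changed.withColumn("is_current", F.lit(1))
           .withColumn("start_date", F.current_date())
           .withColumn("end_date", F.lit(None).cast("date"))
)

# In a real pipeline, expired and new_versions would be merged back into dw.customer_dim.
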
2007 - 2008

Senior ETL Developer

Magna Infotech Ltd
  • Managed ETL development and data warehousing application support activities.
  • Acquired hands-on experience spanning dimensional modeling through ETL design.
  • Developed mappings for type two slowly changing dimensions to update existing rows and insert new ones in targets.
Technologies: Informatica ETL, Unix Shell Scripting, Oracle, Data Engineering, SQL, Data Pipelines, ETL, Pipelines, Data Management, Informatica, Dimensional Modeling, PL/SQL, Data Architecture, Query Composition, Performance Tuning, Data Warehouse Design, Data Warehousing, Relational Databases, Databases, ETL Tools, ETL Development, SQL Stored Procedures, Excel Macros, Oracle PL/SQL, PL/SQL Tuning

Experience

Tool Client Rate (TCR) Desk

TCR Desk is a web-based tool providing authoritative cash management pricing arrangements and contact information for mid and large corporate client segments. The business contact center, relationship managers, and cash management sales personnel utilize the application.

The TCR Desk application migration solution leverages best practices from Azure's Well-Architected Framework, in compliance with the client's Azure Service Governance rules, to make the solution secure, resilient, highly available, and scalable. These design principles are intended for the client's Azure production environment. The same design will be implemented in the disaster recovery and lower environments, but without high availability and disaster recovery.

Contribution
• Designed and developed data ingestion pipelines using ADF and a processing layer using Databricks and notebooks with PySpark.
• Led the planning, design, development, testing, implementation, documentation, and support of data pipelines.
• Collaborated with ETL teams, both client and IBM.
• Analyzed on-premises Informatica-based ETL solutions and designed ETL solutions using Azure Data Factory pipelines, Azure Databricks, PySpark, and Spark SQL.

Customer Profitability Insights (CPI)

The Business Banking Customer Profitability (BBCP) project aims to develop a new profitability analysis platform for business banking and expand its usage from the over $5 million credit segment to all client credit segments.

Contribution
• Developed solutions in a highly demanding environment and provided hands-on guidance to other team members.
• Headed complex ETL requirements and design.
• Implemented Informatica-based ETL solution fulfilling stringent performance requirements.
• Collaborated with product development teams and senior designers to develop architectural requirements to ensure client satisfaction with the product.
• Assessed requirements for completeness and accuracy.
• Determined whether requirements were actionable for the ETL team.
• Conducted impact assessment and determined the size of effort based on requirements.
• Developed complete SDLC project plans to implement ETL solutions and identify resource requirements.
• Performed an active, leading role in shaping and enhancing overall ETL Informatica architecture.

Achmea Solvency II

This project aims to establish a revised set of EU-wide capital requirements and risk management standards that will replace the current solvency requirements. It consists of four releases.

Solvency II requires that all material risks of an insurer be made transparent so that the insurer can calculate the capital it must hold as coverage for unforeseen circumstances. Driven by these requirements and legislation, Achmea started the Value Management program.

A vital program result is the realization of an automated reporting facility backed by an integrated actuarial data warehouse.
• Release-1: Life 400 insurance
• Release-2: Non-life insurance
• Release-3: ALI/AMIS
• Release-4: VITALIS

Contribution
• Led practical knowledge transfer sessions with modelers.
• Led technical design meetings for designing individual layers.
• Analyzed functional design documents and prepared analysis sheets for individual layers.
• Worked extensively on the technical design document set and amended it as appropriate for the current release.

Data Analyst – Azure Data Factory Expertise

I was a senior data engineer with analyst skills, working on ETL architecture solutions, requirements assessments, and the design of suitable data flows and batches. I also optimized solutions and built end-to-end data pipelines with data integrity.

Skills

Languages

SQL, Python, T-SQL (Transact-SQL), Python 3, Snowflake, XML, C, C++, Pascal, R, Scala

Frameworks

Apache Spark, Spark

Tools

Informatica ETL, Autosys, Tableau, Postman, AWS Glue, Amazon Elastic MapReduce (EMR), Apache Airflow, AWS IAM, Amazon CloudWatch, Amazon QuickSight, GitHub, Excel 2016, Microsoft Power BI, Microsoft Excel, Control-M, Google Analytics, Power Query

Paradigms

ETL, Dimensional Modeling, Business Intelligence (BI), OLAP, Kanban, Database Design, DevOps, Data Science

Platforms

Oracle, Azure, Amazon Web Services (AWS), Databricks, Amazon EC2, AWS Lambda, Google Cloud Platform (GCP)

Storage

Netezza, IBM Db2, Database Management Systems (DBMS), Data Pipelines, Relational Databases, Databases, PostgreSQL, SQL Stored Procedures, Data Integration, Database Architecture, Oracle PL/SQL, NoSQL, Database Transactions, MySQL, Azure SQL Databases, Azure SQL, JSON, Data Lakes, PL/SQL, Microsoft SQL Server, Redshift, Amazon Aurora, AWS Data Pipeline Service, PostgreSQL 10.1, Amazon DynamoDB, Database Administration (DBA), Database Migration, Database Structure, OLTP, Amazon S3 (AWS S3), Datadog, Elasticsearch

Other

Unix Shell Scripting, Informatica, Data Engineering, Pipelines, Data Management, Data Architecture, Migration, Query Composition, Data Warehouse Design, Data Warehousing, CSV File Processing, CSV, ETL Tools, Scripting Languages, Orchestration, Technical Project Management, ETL Development, Data Transformation, Normalization, Shell Scripting, Architecture, Data Analysis, Data Analytics, Database Optimization, PL/SQL Tuning, Data Build Tool (dbt), DAX, Transactional Data, Data, Azure Data Factory, Azure Databricks, Azure Data Lake, Azure Synapse, Azure SQL Data Warehouse (SQL DW), Informatica Cloud, Data Modeling, Query Optimization, Performance Tuning, Analytics, XLSX File Processing, Data Migration, APIs, Data Wrangling, Amazon RDS, Excel Macros, Lambda Functions, Big Data Architecture, AWS Cloud Architecture, CI/CD Pipelines, Webhooks, BI Reporting, CDC, Data-driven Dashboards, Business Services, Identity & Access Management (IAM), Azure Data Explorer, Consulting, Machine Learning, Google Analytics 4, Big Data, Data Visualization, Microsoft Power Automate, Real Estate, Geospatial Data, AWS Certified Cloud Practitioner, Microsoft Azure

Libraries/APIs

Pandas

Education

1998 - 2002

Bachelor of Technology Degree in Electrical Engineering

Jawaharlal Nehru Technological University - Hyderabad, India

Certifications

JUNE 2023 - JUNE 2026

AWS Certified Cloud Practitioner

AWS

DECEMBER 2021 - DECEMBER 2022

Azure Data Engineer

Microsoft

AUGUST 2021 - PRESENT

Microsoft Azure Fundamentals

Azure