Aniket Abhishek Soni


Senior Data Engineer — Databricks · Snowflake · Cloud · AI/ML

Senior Data Engineer with 7+ years building enterprise-scale ETL pipelines, AI-driven analytics platforms, and cloud-native infrastructure at Cognizant. Researcher with 20+ published papers in AI, data engineering, and geospatial analytics — including a book on distributed systems. Beyond engineering, Aniket mentors aspiring technologists, contributes to climate tech through geospatial modeling, and is recognized as a speaker, judge, and thought leader at global AI and data events.

Current Role: Sr. Data Engineer @ Cognizant
Published Papers: 20+ peer-reviewed
Citations: 138+ (RG) · 109 (Scholar)
Years Experience: 7+ in data engineering & cloud
Records / Day: 10M+ processed at enterprise scale
Google Cloud Certified: Associate Cloud Engineer
Databricks Certified: Data Engineer Associate
Senior Member IEEE (SMIEEE) · Fellow SCRS · Royal Fellow IOASD · Life Member CSI · Upsilon Pi Epsilon · Sigma Xi Affiliate · GDG New York · 20+ peer reviews · IEEE Day 2025 Ambassador
Currently
New: AI-Enabled Serverless Engineering — Technical Press, 2026 (co-authored)
Book chapter — Wiley-Scrivener, Digital Twins & GenAI (in press)
138+ citations (RG) · 109 (Scholar) · h-index: 4
7+ Years Experience
10M+ Records / Day
20+ Papers Published
138+ Citations (RG)
4 h-index
3+ Press Features
9+ Judging Roles
150M+ Climate Data Points
Personal Statement

About

Aniket Abhishek Soni is a Senior Data Engineer based in New York with over seven years of experience designing enterprise-scale data infrastructure, AI-powered analytics systems, and cloud-native pipelines. At Cognizant Technology Solutions, he has architected solutions processing over 10 million records per day across regulated industries — building the kind of reliable, auditable, and high-performance data foundations that make AI strategies viable in the real world.

Beyond his engineering role, Aniket is an active researcher and published author. His body of work spans 20+ peer-reviewed papers across AI, data engineering, distributed systems, and geospatial analytics — including a sole-authored book published by BP International and a forthcoming book chapter with Wiley-Scrivener. His research has been cited 138+ times and is indexed on Google Scholar, ORCiD, and ResearchGate. He has presented at international conferences across Scotland, Bahrain, and online venues, contributing findings that bridge academic rigor with practical engineering impact.

Aniket also brings a distinctive lens through his work with climate technology. As a Research Analyst at Climate Change Xplorers, he analyzed over 150 million global data points to identify optimal locations for IoT weather station deployment — a project that combined geospatial modeling, machine learning, and big data integration in service of global climate action.

Recognized internationally for sustained contributions to the technology community, Aniket serves as a judge, peer reviewer, speaker, and mentor. He has evaluated work at MIT, IEEE conferences, the SIIA Neal Awards, and Major League Hacking — roles that place him among the small group of professionals called upon to assess the work of their peers. He mentors emerging engineers across three continents through FEA India, ENGin Ukraine, and TopMate.io.

Areas of Expertise
Enterprise Data Architecture · AI/ML Integration · Cloud-Native Systems · Distributed Systems Research · Climate Tech & Geospatial · Data Governance
🌍 Climate Research

Geospatial modeling of global weather station deployment using NOAA, FSI, MADIS, and IBM Watson data — supporting climate action through data-driven infrastructure decisions.

150M+ data points · 151 locations · 8 FSI categories
Certifications
GCP Associate Cloud Engineer · GCP Cloud Digital Leader · Databricks DE Associate
Technical Stack

The Pipeline

Languages: Python · SQL · PySpark · Java · Go (Golang) · Scala
Platforms: Databricks · Snowflake · Delta Lake · PostgreSQL · MongoDB · SQL Server
Pipelines: Apache Airflow · Spark Streaming · ETL / ELT · Data Observability · CI/CD Pipelines · GitLab / GitHub
Cloud: AWS (S3, Glue, EC2) · GCP (ACE Certified) · Azure / OneLake · Docker · Kubernetes · IAM / KMS
Analytics: Power BI · Tableau · Matplotlib / Seaborn · Folium / Geoplot · KPI Dashboards · Scikit-learn
Domains: Financial Data · Geospatial Analytics · NLP / Speech AI · Climate / NOAA · Data Governance · Healthcare Data
Career History

Experience

Mar 2021 — Present
Cognizant
Senior Data Engineer · New York, US · Full-time
Designed ETL pipelines processing 10M+ records/day across financial services and healthcare
Improved data accuracy by 30%, reduced reporting errors by 37%
Built Slack + Databricks alerting, cutting incident resolution time by 60%
Implemented governance frameworks, improving onboarding efficiency by 30%
Mentored junior engineers, reducing onboarding time by 30%
Databricks · Snowflake · PySpark · Python · Power BI · SQL · MongoDB
Sep 2020 — Mar 2021
Climate Change Xplorers
Research Analyst · New York, US · Part-time
Analyzed 150M+ data points to identify 151 optimal IoT weather station locations globally
Built ETL pipelines ingesting ~10K records/day from NOAA, FSI, MADIS, IBM Watson
Designed geospatial visualizations boosting rendering performance by 70%
Python · NOAA/FSI · Folium · Geospatial · Pandas
Jul 2020 — Sep 2020
Converseon.AI
Software Engineer Intern · New York, US
Resolved 7+ JIRA tickets/week in a Python-Django computer vision platform
Built ETL cron jobs ingesting Twitter and YouTube API data into Elasticsearch
Created FAQ documentation saving 4 dev hours/week
Python · Django · PostgreSQL · Elasticsearch
Nov 2018 — May 2020
Southern Arkansas Univ.
Graduate & Teaching Assistant · Magnolia, AR
Maintained Honors College web server and developed academic webpages
Managed library database and hardware support at Magale Library
Provided front-line IT support to faculty and students
Web Dev · DB Admin · IT Support
Jan 2018 — Jun 2018
Alok Enterprise
Computer Programmer & Data Analyst · Ahmedabad, India
Developed .NET applications supporting day-to-day business operations
Analyzed sales and inventory data in Excel to improve decision-making
Automated reporting tasks, reducing manual workload
.NET · MySQL · Excel · HTML/CSS
Portfolio

Selected Projects

Live from github.com/aniketsoni1
ML · Generative AI · Audio

Music Mood App

Built for Google AI's Music Mood App Competition. Developed a full ML pipeline using generative AI for mood-based music classification — achieving 85% accuracy, a 30% improvement over baseline. Designed an interactive UI that boosted user engagement by 40%.

85% ML Accuracy · +40% Engagement
Aug 2024 · Google AI Competition
NLP · Speech · Python

Offline Speech-to-Text (Vosk)

High-accuracy offline transcription tool using Vosk + Python. Integrated KaldiRecognizer with pre-trained models, automated audio-to-text export as DOCX — eliminating 100% of manual transcription effort. Basis for an arXiv-published paper.

100% Manual Effort Saved · Offline-First
May 2024 · arXiv:2503.21025
Web · Air Quality · Django

Air Quality Web App

Real-time AQI lookup by ZIP code built with Python-Django backend and Bootstrap frontend — providing live air quality data via an intuitive search interface.

Jun 2020
Data Analysis · Public Health · Python

COVID-19 Data Analysis

Pulled Johns Hopkins COVID-19 data, analyzed with Pandas, and visualized with Plotly + Folium. Compared countries by social distancing, government response, healthcare capacity, and testing feasibility during peak pandemic uncertainty.

Johns Hopkins DB · Plotly + Folium
May 2020
NLP · Twitter API · ML

Twitter Sentiment Analysis

Live tweet ingestion via Twitter API → PostgreSQL → Scikit-learn Naive Bayes model. Achieved greater than 80% AUC across positive, negative, and neutral classes with manually annotated training data.

>80% AUC
Apr 2020
.NET · Healthcare · ERP

Medicines Stock Management System

Full-stack .NET application with prescription management, real-time dashboards, automated email/SMS alerts to distributors, and smart low-stock threshold detection.

May 2016 · Capstone Project
Academic Output

Research & Publications

Type | Title | Venue | Year | DOI / Link
Book | Scalable Infrastructure: Building Reliable Distributed Systems | BP International · ISBN 978-93-49970-00-7 · Sole Author | 2025 | 10.9734/bpi
Book | AI-Enabled Serverless Engineering for Autonomous Cloud Operations | Technical Press · ISBN 978-81-69069-56-4 · Co-author (with Milan Parikh) | 2026 | ISBN
Book | Building Organizational Intelligence Using Digital Twins and Generative AI (Chapter) | Wiley-Scrivener Publishing · In publishing process | 2025 |
Conf. | Big Data Workload Profiling for Energy-Aware Cloud Resource Management | DASET · Scotland | Jan 2026 |
Conf. | Reinforcement Learning For Dynamic Workflow Optimization In CI/CD Pipelines | 17th IEEE CICN | Dec 2025 | IEEE
Conf. | Data Foundations for a Successful AI Strategy: A Blueprint for AI-Ready Enterprises | DASA · Bahrain | Dec 2025 |
Conf. | Detection of Advance Malware Threats using Hybrid Deep Learning Model and Image Analysis | IC3 (International Conference on Contemporary Computing) | Aug 2025 | 10.1109/IC3…
Conf. | Dynamic Context Tuning for Enhanced Multi-Turn Planning in Retrieval-Augmented Generation Systems | ICE2CPT | Oct 2025 | arXiv:2506.11092
Conf. | Combining Threat Intelligence with IoT Scanning to Predict Cyber Attacks | AIBThings (3rd International Conference) | Sep 2025 | arXiv:2411.17931
arXiv | Improving Speech Recognition Accuracy Using Custom Language Models with the Vosk Toolkit | arXiv [cs.SD] | Mar 2025 | arXiv:2503.21025
Journal | Edge Vs Cloud Computing Performance Trade-Offs for Real-Time Analytics | IJSEA · Vol. 14, Issue 06 | Jun 2025 | 10.7753/IJSEA
Journal | Self-Healing Data Pipelines: A Fault-Tolerant Approach to ETL Workflows | IJETRM · Vol. 14, Issue 05 | May 2025 | zenodo.15615306
Journal | Dynamic Resource Allocation in Serverless Architectures using AI-Based Forecasting | IJERT · Vol. 14, Issue 04 | Apr 2025 | zenodo.18104614
→ Full list on Google Scholar · ORCiD · ResearchGate · Semantic Scholar
Media Coverage

Press & Articles

Press Coverage
Writing on Medium
Awards & Community

Recognition

2025 · Rising Stars 30 · The Channel Co. Computing
2024 · 40 Under Forty · AFCEA
2023 · Young Achievers' Award · Indian Achievers' Forum
2022 · Cheers Award · Cognizant
Speaking
Google DevFest Bronx · New York · 2023
"The Tech Mentor's Journey: Navigating Challenges and Shaping Futures"
Invited speaker at Google's community developer festival, addressing over 200 attendees on the role of mentorship in building the next generation of technologists. Google DevFest is one of the largest global developer event series.
Expert Judging & Peer Selection
Competition Judge
MIT 100k Pitch, Accelerate & Launch
One of the world's most prestigious startup competitions, hosted annually at MIT. Evaluated venture pitches, business models, and technical feasibility of early-stage companies.
Industry Awards Judge
SIIA Jesse H. Neal Awards (2024, 2025, 2026)
Judged the awards known as the "Pulitzers of Business Media," evaluating editorial excellence across specialized media and information industry content. Selected as a judge for three consecutive years.
Hackathon Judge
Major League Hacking (MLH) 2025
Evaluated technical projects at multiple MLH hackathons: Open Source, Web3Apps, AI Hackfest, and SummerCodex. MLH is the official student hackathon league, hosting 200+ events globally per year.
IEEE Peer Reviewer
20+ Reviews Across 8 IEEE Conferences
Peer review for ICETEG, RCAAI, ICoMMS, AGRETA, ISWTA, NMITCON, SSITCON, PuneCon (2025), and JMIR via PreReview.org — evaluating submitted research for technical rigor and academic contribution.
Industry Awards Judge
Business Intelligence Group — Sammy Awards (2025)
Evaluated business intelligence and analytics software, services, and solutions submitted to the Sammy Awards, recognizing innovation in the BI and analytics industry.
Ambassador
IEEE Day 2025 · AnitaB.org Grace Hopper #SpecSquad 2025
Selected as IEEE Day Ambassador (2025) and AnitaB.org Grace Hopper Celebration #SpecSquad Ambassador — representing the IEEE community and advancing diversity in technology.
Associate Editor
International Journal of Engineering in Computer Science
Editorial role evaluating and managing peer-review processes for submitted manuscripts in computer science and engineering research.
Industry Awards Judge
Globee Awards — AI, Business & Technology (2023, 2025)
Evaluated nominations across AI, business technology, and innovation categories at the Globee Awards, a global business awards program recognizing achievement across industries.
Professional Memberships
Senior Member IEEE (SMIEEE)
Upsilon Pi Epsilon (UPE)
PgUS Member
Sigma Xi Affiliate Circle
Computer Society IEEE
AAAS Member
Royal Fellow · IOASD
Fellow Member · SCRS
Life Member · CSI
GDG New York
Associate Editor · IJECS
Volunteering & Mentorship
Mentor · Freedom Employability Academy (India)
Mentor · ENGin Ukraine
Volunteer · Cognizant Outreach (100+ hrs)
Mentor · TopMate.io
Academic Background

Education

Master of Science in Computer Science
Aug 2018 — May 2020
Magnolia, AR, USA
Bachelor of Engineering in Computer Engineering
Aug 2011 — Jun 2016
India
Certifications
Google Cloud Certified Associate Cloud Engineer · Google Cloud Digital Leader · Databricks Certified Data Engineer Associate
Connect & Collaborate
GET IN TOUCH

Engaged in research collaborations, peer review, expert judging, speaking invitations, and mentorship across the global technology and data engineering community.