I eliminate work that doesn't deserve human attention. Raw data and manual processes become predictable AI services and automation — in weeks, not months.
If an action repeats without thinking, it's a defect, not work.
Data shouldn’t be moved by hand. Reports shouldn’t be assembled manually. Routine questions shouldn’t cost a specialist’s time.
In 22 years the tools have changed — ETL pipelines, orchestrators, AI agents — but the principle hasn’t: every process that repeats without thinking can and should be handed to a system. The sharper this line is drawn, the more people work on what actually requires a human mind.
Fixed 2-week sprint. I identify pipeline bottlenecks, failure points, and data quality risks. You get a prioritized roadmap and target architecture: what to fix first and how long it takes.
4–6 weeks from hypothesis to pilot production. Scoring, recommendations, forecasting, RAG-powered documentation assistant — I take one problem and deliver a measurable result you can show to the business.
End-to-end data flows between CRM, ERP, SAP, analytics, and external services. Normalization, cleansing, routing — so data doesn’t get lost in transit, reports add up, and decisions come faster.
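A minimal sketch of what such a normalization-and-routing layer looks like. All field names, formats, and routing rules here are illustrative, not from a real client system:

```python
from dataclasses import dataclass

@dataclass
class Record:
    source: str          # e.g. "crm" or "erp" (illustrative source tags)
    customer_id: str
    amount_raw: str      # amounts arrive as strings in mixed locale formats

def normalize(rec: Record) -> dict:
    """Cleanse a raw record into one canonical shape."""
    amount = float(rec.amount_raw.replace(" ", "").replace(",", "."))
    return {
        "customer_id": rec.customer_id.strip().upper(),
        "amount": round(amount, 2),
        "source": rec.source,
    }

def route(rec: dict) -> str:
    """Decide which downstream system receives the record (toy rule)."""
    return "analytics" if rec["amount"] >= 1000 else "archive"

cleaned = normalize(Record("crm", " ab-42 ", "1 250,50"))
```

The point of the sketch is the separation of stages: cleansing happens once, in one place, and routing decisions operate only on already-canonical data.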
Internal AI tools for your team: support chatbots, ticket classification, document summarization, automated reports. Each of these processes repeats without thinking — which means a human shouldn’t be doing it.
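Even ticket classification often starts from a deliberately simple baseline before any model is involved. A toy rule-based router, with categories and keywords that are purely illustrative:

```python
# Rule-based ticket routing baseline; categories and keywords are
# illustrative, not from a real deployment.
ROUTES = {
    "billing": {"invoice", "payment", "refund"},
    "access": {"password", "login", "locked"},
}

def route_ticket(text: str) -> str:
    """Send the ticket to the category with most keyword hits, else triage."""
    words = set(text.lower().split())
    best = max(ROUTES, key=lambda cat: len(ROUTES[cat] & words))
    return best if ROUTES[best] & words else "triage"

category = route_ticket("Cannot login after password reset")
```

A baseline like this sets the bar a classifier has to beat, and "triage" keeps ambiguous tickets with a human.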
A major industrial group with multiple manufacturing divisions across Russian regions. Data from SAP, MSSQL, MySQL, and PostgreSQL existed in silos, reporting was assembled manually, and feeding parameters into the SCADA system required complex business logic for transformation and validation.
Designed and deployed a centralized ETL platform: Apache NiFi for orchestrating flows between sources, Airflow for scheduling batch jobs, and Kafka for real-time event streaming. Built a normalization and cleansing layer, plus a dedicated transformation service that prepares data for SCADA, with complex routing and validation at every stage.
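The validate-then-transform stage in front of SCADA can be sketched like this. Parameter names, bounds, and the rounding rule are illustrative, not the real plant's values:

```python
# Toy validate-then-transform stage guarding SCADA inputs.
# Parameter names and limits are illustrative.
LIMITS = {"furnace_temp_c": (0.0, 1600.0), "pressure_bar": (0.0, 25.0)}

def validate(params: dict) -> dict:
    """Return a dict of errors for out-of-range values; empty means clean."""
    errors = {}
    for name, value in params.items():
        lo, hi = LIMITS.get(name, (float("-inf"), float("inf")))
        if not (lo <= value <= hi):
            errors[name] = f"{value} outside [{lo}, {hi}]"
    return errors

def transform(params: dict) -> dict:
    """Example post-validation step: round to instrument precision."""
    return {name: round(v, 1) for name, v in params.items()}

batch = {"furnace_temp_c": 1250.04, "pressure_bar": 30.0}
errors = validate(batch)
```

The design choice worth noting: validation rejects a batch before transformation runs, so nothing unverified ever reaches the control system.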
Unified data flow across all divisions. Reporting time dropped from days to hours. Manual reconciliation was eliminated entirely. The SCADA subsystem now receives verified data automatically.
A government agency in a CIS country needed continuous social media monitoring: collecting and storing graph data (user connections, communities, audience overlaps), identifying influence clusters, and tracking the emergence of new opinion leaders in near real-time.
Deployed a Cloudera Hadoop cluster (HDFS, YARN, Hive, HBase, Spark) for storing and processing large data volumes. Crawling was implemented in Python (Scrapy) with distributed task scheduling. Graph structures stored in Neo4j, cluster analysis and opinion leader ranking via Spark GraphX. Automated analytical report generation on a schedule.
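The ranking idea behind opinion-leader detection can be illustrated with PageRank by power iteration on a toy follower graph. The production system ran on Spark GraphX over millions of nodes; this pure-Python version and its edge list exist only to show the principle:

```python
# Minimal PageRank by power iteration; illustrative only — the real
# ranking ran on Spark GraphX over a graph with millions of nodes.
def pagerank(edges, d=0.85, iters=50):
    nodes = {n for edge in edges for n in edge}
    out = {n: [v for u, v in edges if u == n] for n in nodes}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        new = {n: (1 - d) / len(nodes) for n in nodes}
        for u in nodes:
            if out[u]:
                share = d * rank[u] / len(out[u])
                for v in out[u]:
                    new[v] += share
            else:  # dangling node: spread its rank evenly
                for v in nodes:
                    new[v] += d * rank[u] / len(nodes)
        rank = new
    return rank

# "a follows c" is the edge (a, c): rank flows toward the followed account
edges = [("a", "c"), ("b", "c"), ("d", "c"), ("c", "a")]
ranks = pagerank(edges)
```

An account followed by several already-ranked accounts accumulates the highest score, which is exactly the "critical mass" signal used to flag emerging opinion leaders.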
The system processes millions of nodes and edges. New opinion leaders are detected automatically within 24 hours of reaching critical mass. Analytical reports are generated without manual intervention.
The client needed regular data collection from dozens of external web sources with aggressive anti-bot protection. Data had to be normalized, deduplicated, loaded into Hadoop and the client's internal systems, and made available through management dashboards.
Built a distributed crawler with rotation through a pool of proxy providers, adaptive algorithms for bypassing rate limits and CAPTCHA, and intelligent request throttling. ETL pipeline: cleansing and normalization → loading into HDFS/Hive → data marts for BI. Management dashboards in Superset. Legal and logistical aspects of data collection were addressed.
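Two of the building blocks mentioned above, round-robin proxy rotation and backoff with jitter, can be sketched in a few lines. Proxy hostnames and timing values are illustrative:

```python
import itertools
import random

# Illustrative proxy pool; in production this was a rotating pool
# sourced from several providers.
PROXIES = ["http://proxy-a:3128", "http://proxy-b:3128", "http://proxy-c:3128"]
_rotation = itertools.cycle(PROXIES)

def next_proxy() -> str:
    """Each outgoing request takes the next proxy in the pool."""
    return next(_rotation)

def backoff_delays(attempts: int, base=1.0, cap=60.0, seed=0):
    """Delay before retry n: uniform(0, min(cap, base * 2**n)) — full jitter."""
    rng = random.Random(seed)
    return [rng.uniform(0, min(cap, base * 2 ** n)) for n in range(attempts)]

delays = backoff_delays(6)
```

Full jitter matters here: synchronized retries from a distributed crawler look exactly like the bot traffic the target is trying to block.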
Stable automated collection from 50+ sources — zero manual steps. Data available to analysts via dashboards and to ML engineers via Hive/Spark within hours of appearing at the source.
The client's technical documentation consisted of thousands of PDF pages in English: regulations, equipment specifications, and operating manuals. Field engineers and operators work in Russian and spent hours searching for the right sections. Critical requirement: zero hallucinations — answers strictly from the documentation text, with no free interpretation allowed due to the domain specifics.
Built a RAG platform: PDF document loading and parsing with structure preservation (sections, tables, diagrams), chunking with semantic boundary awareness, and indexing into a vector database. The retrieval layer uses hybrid search (semantic + keyword) for precise extraction of relevant fragments. Response generation via GPT-4 with strict prompt engineering: the model answers only based on retrieved fragments, every claim includes a reference to the specific document, section, and page. Cross-lingual capability: question in Russian → search across English documents → answer in Russian with citations from the original.
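The hybrid-scoring idea in the retrieval layer reduces to blending a keyword score with a vector-similarity score. A toy version with made-up documents, embeddings, and weights (the real system used a vector database plus a keyword index):

```python
import math

# Toy hybrid retrieval: blend keyword overlap with cosine similarity.
# Documents, 3-d "embeddings", and the alpha weight are illustrative.
DOCS = [
    {"id": "manual-3 §2.1 p.14", "text": "pump seal replacement procedure",
     "vec": [0.9, 0.1, 0.0]},
    {"id": "spec-7 §4.3 p.88", "text": "electrical cabinet wiring diagram",
     "vec": [0.1, 0.9, 0.2]},
]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def hybrid_search(query_terms, query_vec, alpha=0.5):
    """Score = alpha * keyword overlap + (1 - alpha) * cosine similarity."""
    def score(doc):
        kw = len(set(query_terms) & set(doc["text"].split())) / max(len(query_terms), 1)
        return alpha * kw + (1 - alpha) * cosine(query_vec, doc["vec"])
    return max(DOCS, key=score)

best = hybrid_search(["pump", "seal"], [0.8, 0.2, 0.0])
```

Note that each retrieved fragment carries its document/section/page identifier — that is what lets every generated claim cite its source, which is the anti-hallucination mechanism described above.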
Information lookup time dropped from hours to seconds. Employees receive precise answers with direct source references; hallucinations are eliminated at the architecture level. The platform is used daily by multiple departments.
Russia's unified federal e-government portal. Development of several high-load backend subsystems with strict security requirements (including GOST cryptography) and integration with dozens of government agency systems.
Built the development team from scratch and led it as it grew to 15 people. Stack: Spring, MyBatis, Oracle, RabbitMQ, CXF (SOAP integrations with government agencies), CryptoPro for GOST encryption. Introduced Scrum and CI processes.
Subsystems launched in production, serving tens of millions of citizens. The team continued operating after the handover.
1–2 weeks. Data, ETL, and infrastructure audit. You get a current-state map and clear priorities.
Target architecture, risk assessment, cost and timeline estimates. You know what you’re paying for.
Rapid prototype on real data. A result you can show to the business.
Deployment, monitoring, documentation. Your team can operate independently.
Iterations, evolution, knowledge transfer. Dependency on me decreases every month.
30–40 minute call → we identify bottlenecks → proposal with timeline and cost estimate. Even if we don’t start — you get a fresh perspective on your processes.
Discuss your project