I am a software and data engineer who started a professional IT career in 2016. Within data engineering, I have solid experience with Apache Spark, Delta Lake, and Databricks as a platform. I love functional programming and use it in most of my projects. My other interests include static typing in Python (mypy), software architecture, distributed systems, and concurrent programming. My main language is Python, but I also work in Scala.
I have varied experience building data lakes (primarily Databricks or Delta Lake on HDFS), ELT pipelines, testing tools, CI/CD systems, cloud infrastructure (mostly AWS), and RHEL and CentOS systems.
I am a big advocate for software testing. Generally, I work on solving complex problems and building system architectures and frameworks.
My last three projects are used by many people and keep evolving even though I no longer maintain them.
I prefer reading books to gain a deep understanding of technologies.
I maintain multiple OSS projects. The latest one I built is pramen-py https://github.com/AbsaOSS/pramen/tree/main/pramen-py, a framework that provides a convenient way to define Spark transformation pipelines.
Another project, which makes testing more convenient, is https://github.com/zhukovgreen/pytest-when
One of my favorites, small but very powerful, is https://github.com/zhukovgreen/friendly-sequences, a type-safe library for function chaining.
At Paylocity I built frameworks for data quality, a feature store, a declarative schema evolution framework, a new CI/CD system, and many other useful projects.
- My talks repo - https://github.com/ZhukovGreen/talks
- PyCon CZ 2023 - Can we have a better feature store https://cz.pycon.org/2023/program/talks/65/
- DevConf.US 2021 - Framework for integration tests lifecycle https://www.youtube.com/watch?v=K7VcLnHRz0w&list=PLU1vS0speL2ZbTPg-aU2Rw2s6IPsTVoCF&index=60
- Using asyncio for building CLI applications (PyAmsterdam 2020) - https://py.amsterdam/2020/10/15/virtual-pyamsterdamnowtzzoneinfoeuropeamsterdam-stayathome.html
- HVAC engineer and Python (PyCon CZ 2018) - https://youtu.be/KAZn2Fhh7f4?t=324
Dates Employed Aug 2022 - present
Building a data platform for data scientists.
Databricks on AWS is the main platform. Pulumi / Terraform for infrastructure. Python is the primary language. Spark and Delta Lake.
Scope of work:
- setting standards for the team's software development practices
- building the data platform
- enabling a data quality system
- data transformation framework
- automation of Databricks pipeline deployment
- feature store, data contracts, and table metadata management
- schema evolution
- query optimization
- streaming and batch data processing
Plus many other things in the areas of big data, Delta Lake, data ingestion, data catalogs, etc.
Dates Employed Aug 2021 - Aug 2022
Developing and building components of an on-premises system for convenient big data ETL processes, together with abstractions around the data warehouse for data scientists (feature-centric interfaces). Apart from this work, I developed data transformations for various projects and maintained one existing project.
Dates Employed Jul 2021 - Aug 2021
Working in the Convert2RHEL team. Designing a simple yet specific and rich CI system for developing and running integration tests (libvirt, Ansible, Testing Farm, tmt). Developing new features in the upstream project and doing code review. Adopting pytest, transitioning the app to Python 3, mentoring.
A note from the promotion document:
Artem joined Red Hat during the spring of 2020 as a software developer working on
the LEAPP team. Artem’s enthusiasm for Python and pythonic development practices
soon led him to adopt an advocacy role on his team.
Artem transitioned to the Convert2RHEL team in early 2021, and rapidly became one
of the team’s most prolific contributors. He continued to broaden his reach beyond
his SST by creating a hardware deprecation database and associated microservice
which helps to take the mystery out of hardware support.
Artem’s creativity and energy have made him a true asset to his teams.
Dates Employed Apr 2020 - June 2020
I worked in the OS & App modernization team (OAMG).
Primary responsibilities were:
- Maintaining and contributing to OAMG repositories https://github.com/oamg
- Developing a data delivery system (an internal framework to distribute various data to its clients)
- Working on the convert2rhel utility (new features and the CI)
Dates Employed Jan 2016 – Jan 2020
Building a software platform to support new products and company processes.
Dates Employed May 2006 – Aug 2016
I worked in a variety of positions within the HVAC industry:
- Compact Air Handling Units (AHU) project manager (~1 year)
- AHU technical support (~1 year)
- HVAC designer (~5 years)
- Energy modeler for LEED certification (~1 year)
- Technical supervisor on site (~1 year)
- Ventilation systems installer (~1 year)
Degree Name Online Education Field Of Study CS229: Machine Learning Grade NA Dates attended or expected graduation 2016 – 2017
I went through all lecture videos and notes and completed all assignments. Course syllabus: http://cs229.stanford.edu/syllabus.html
Degree Name Nano-degree Field Of Study Machine Learning Grade Nano-degree Dates attended or expected graduation 2016 – 2018
https://www.udacity.com/course/machine-learning-engineer-nanodegree--nd009
Degree Name Master’s Degree Field Of Study Mechanical Engineering (HVAC) Grade M.Sc. in heating, ventilation, air conditioning systems Dates attended or expected graduation 2002 – 2008
This is my foundational education: a lot of mathematics, physics, and technical drawing.
- Udacity: PyTorch Scholarship Challenge from Facebook
- AWS training courses (Big Data, Data Lakes, Developing with AWS)
- Numerous courses on Udemy/Coursera, such as data structures and algorithms, functional programming, PyTorch reinforcement learning, etc.
- Russian - native
- English - good professional level
- Czech - good professional level