Yandex School of Data Analysis
It’s been 5 years since I graduated from the school…
But wait, you might ask: what is it, and why is it important?
Let me introduce you to the best additional education at the intersection of science and technology. It’s not a classical school; it is a two-year program where machine learning and data analysis are taught in depth. Well, maybe it’s not the best in the world, but this school (or simply SDA) without doubt defined my future and my passion for applied science.
In terms of level of knowledge and complexity, it is comparable to a master’s degree in computer science. Classes are held in the evening, which makes it possible to combine them with university, but because of the workload you have to prioritise what matters more to you.
Even though it’s a Russian school, there is one branch in Israel, and I bet Yandex will open more across Europe.
Two reasons prompted me to write this article:
- to show that there are always opportunities around us to grow, even if you were not lucky with your university
- to demonstrate how this education is structured, which can help answer the question of why so many other courses (online or offline) are a complete deception.
History of SDA
It was launched (and is funded) by Yandex, the best-known and most technologically advanced IT company in Russia, which created a search engine (one year earlier than Google) and grew into a giant with delivery and taxi services, its own marketplace, a voice assistant and more.
The idea behind the school was to fill the gaps in the knowledge of students from Computer Science specialties. Essentially, it is a factory where Yandex builds its future workers with a firm background in CS. In the 2000s, when Yandex flourished, it was hard to find excellent specialists on the labor market, and the need to solve problems related to data processing (text, images, music, voice) was constantly growing.
Yandex opened the School of Data Analysis in 2007. Out of 80 students who were selected in the first year, only 36 graduated. Since then, branches have opened in various cities across Russia in cooperation with local universities.
Since 2007, Yandex has constantly experimented with the programs. In 2016, when I was admitted, I chose the “Big Data” program out of the 3 on offer, but now there are 4 completely different ones:
- Machine learning developer
- Data science
- Big data infrastructure
- Data analysis in applied sciences
One last thing worth mentioning: this education is totally free. To study there, you only need to pass the entrance exams. There is one more option, though: if you did not pass the exams as well as others, you may be offered a place where you pay for part of the tuition.
Programs and courses
In the year I entered SDA, there were three domains:
- Data analysis. In-depth study of theory and math.
- Big data. Courses aimed at working with big data and the tools for it.
- Computer science. In-depth study of programming languages and software architecture.
I, of course, picked Big data. It still amazes me how a line of code can be executed consistently on multiple hosts at the same time and process petabytes of information, and how even an elementary algorithm can change with such a wealth of data.
Each program defines only 3 required courses that you must pass; beyond that, you are free to attend as many courses as you want.
Although the website estimates the workload at 30 hours per week, I bet you will spend much, much more. During these 2 years, I literally spent all of my free time on the school (and even the time that was set aside for sleep). You may find yourself dreaming about tasks. It was harsh, but it was worth it for that feeling when you finally find a solution after countless hours spent in front of a computer.
As a result, I finished 14 courses (although only 12 were required: 3 courses per semester) related to Machine learning, Big Data and Computer Science. We studied C++ and Python at a whole other level, and several years ago Go was added as well to keep up with the trend.
My favourite courses are:
- Natural language processing, or simply NLP. In my opinion, the best application of Machine learning so far. The place where strict models and statistics work. It covers everything from embeddings and N-gram language models, through Transformer models, to BERT and GPT-3. If you want to know more about the course, there are open materials from it (in English): click 1, click 2. My life turned out differently, but I almost went into a PhD and science just because of this course.
- Machine learning. It needs no introduction. Nonetheless, unlike many similar courses online, it connects math with computer science the hard way. To pass the course you have to know the math behind the frameworks: for example, how ridge regression works and why you need to know about SVD (Singular Value Decomposition).
- Distributed systems. You will learn about consistency, distributed data processing and MapReduce. We had some practice with Hadoop and Spark, for which I am very grateful, because it helped me at my next job.
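To give a flavour of what “the math behind frameworks” means in the Machine learning course, here is a minimal sketch of my own (not course material) that ties ridge regression to SVD. Ridge minimises ||Xw - y||² + alpha·||w||², and if X = U·diag(s)·Vᵀ, the solution is w = V·diag(s / (s² + alpha))·Uᵀ·y, so each singular direction is simply shrunk. The function names, the synthetic data and the use of NumPy are my assumptions for illustration.

```python
import numpy as np


def ridge_via_svd(X, y, alpha):
    """Ridge weights from the thin SVD X = U diag(s) V^T (illustrative helper)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # Each singular direction is scaled by s / (s^2 + alpha):
    # small singular values are damped most, which is the regularisation.
    shrink = s / (s**2 + alpha)
    return Vt.T @ (shrink * (U.T @ y))


def ridge_via_normal_equations(X, y, alpha):
    """Same weights from the textbook formula (X^T X + alpha I)^-1 X^T y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)


# Synthetic data, made up purely for this example.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=50)

w_svd = ridge_via_svd(X, y, alpha=1.0)
w_ne = ridge_via_normal_equations(X, y, alpha=1.0)
print(np.allclose(w_svd, w_ne))  # both routes should give the same weights
```

Seen this way, alpha = 0 gives ordinary least squares, and increasing alpha pulls the weights towards zero along the directions where the data carries the least information.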
No matter which program you choose, you will have mentors for every course. In addition to lectures, every week you are given homework and practice (math exercises, code or Jupyter notebooks) that you should complete, and all of it is diligently checked by Yandex employees or course teachers. Shame on you if you pass a review after fewer than 5–6 pushbacks :).
What I like the most is that during the second year you are given a chance to shine. You are offered not regular tasks but projects that can have an impact on the world. For example, in the NLP course you are assigned an open problem, and some students who are lucky enough publish papers about their work. In another course you can develop an open source product such as CatBoost, or earn medals in Kaggle competitions.
How to prepare
The motto of the school hints: “It will be hard, you will like it”.
You should prepare in advance. Personally, I spent about 2–3 months brushing up on my knowledge and solving math tasks from previous entrance exams in the evenings after university classes.
The school’s entrance exams consist of 3 steps.
First, there is an online round. It is an online contest, similar in structure to Google’s coding competitions, where you have to solve several tasks (it varies, but there are around 10 of them) in 5 hours. These tasks challenge your ability to solve algorithmic and math problems.
Anyone can participate in the contest. The tasks are not that hard; the goal of this round is to thin out the huge number of participants.
The next step is an offline contest. You should definitely prepare for it. It’s more of a math exam than an algorithmic one. You are given 8 tasks and 4 hours to solve as many as possible. Here are two examples:
Ex.1. On a plane tiled with identical rectangles with sides 10 and 20 (rectangles adjoin sides), draw a random circle of radius 4. Find the probability that the circle has common points with exactly three rectangles.
Ex.2. Is there a continuous function f(x) for which f(f(x)) = 1 - x³?
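The exam expects a pen-and-paper argument, but a quick simulation is a handy way to sanity-check an answer to Ex.1 while preparing. The sketch below is my own and rests on an assumption the task does not spell out: the rectangles form a regular grid (four rectangles meeting at every corner), and the circle’s centre is uniform over one 20 × 10 cell. Under that assumption the circle touches exactly three rectangles when its centre is within 4 of both a vertical and a horizontal grid line but farther than 4 from the corner, which gives (64 - 16π) / 200. All names in the code are hypothetical.

```python
import math
import random

# Assumed setup: regular grid of 20 x 10 rectangles, circle of radius 4.
WIDTH, HEIGHT, RADIUS = 20.0, 10.0, 4.0


def rectangles_touched(x, y):
    """Count grid rectangles sharing a point with a circle centred at (x, y).

    (x, y) lies in the cell [0, WIDTH] x [0, HEIGHT]. Because RADIUS is less
    than half of the shorter side, only the nearest vertical line, the
    nearest horizontal line and the nearest corner can be reached.
    """
    dx = min(x, WIDTH - x)   # distance to the nearest vertical grid line
    dy = min(y, HEIGHT - y)  # distance to the nearest horizontal grid line
    count = 1                # the rectangle containing the centre
    if dx <= RADIUS:
        count += 1           # neighbour across the vertical line
    if dy <= RADIUS:
        count += 1           # neighbour across the horizontal line
    if math.hypot(dx, dy) <= RADIUS:
        count += 1           # diagonal neighbour behind the corner
    return count


def estimate_probability(samples=200_000, seed=0):
    """Monte Carlo estimate of P(circle touches exactly three rectangles)."""
    rng = random.Random(seed)
    hits = sum(
        rectangles_touched(rng.uniform(0, WIDTH), rng.uniform(0, HEIGHT)) == 3
        for _ in range(samples)
    )
    return hits / samples


# Closed form under the grid assumption: four 4 x 4 corner squares
# minus four quarter discs of radius 4, divided by the cell area.
exact = (64 - 16 * math.pi) / (WIDTH * HEIGHT)
estimate = estimate_probability()
print(f"closed form: {exact:.4f}, simulation: {estimate:.4f}")
```

If the tiling were a brick-like pattern instead of a grid, the counting rule would change, so treat this as a check of one reading of the task, not as the official answer.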
I like that at this point Yandex doesn’t ask you to know programming languages or computer science. You just need to think and reason well.
Then, the last step is an oral interview with the curators and teachers of the school. There you should show your motivation and explain why you want to study there. Sometimes they ask algorithmic questions and other things, but the most important part is your motivation. You can talk about your experience, hobbies, passion for math, science and so on. Don’t underestimate it. I know people who solved the offline contest perfectly but didn’t show motivation or interest. As a result, they were not accepted.
Before attending the interview, I learned what programs they offer and had already chosen a specialty before being selected. It shows dedication and awareness of the school.
As everywhere, it is not enough to solve every task; dedication and consistency define success.
The difference from other courses
You’ve probably already noticed a growing trend of schools that promise to teach you the basics of programming in three months or so. Let’s face the truth: most of these courses leave you with nothing. Their formula is simple: you pay, you study. There are no admission tests, so people with inconsistent knowledge from different areas are grouped together. Classes are overcrowded; one teacher can be assigned to 20+ people.
In the case of SDA, there is a careful selection of students with a similar background. Moreover, because the exams require a lot of time and effort, getting in motivates you to study hard. This already accounts for 50 percent of success: students can simply learn from each other.
Courses are intense. In contrast to university, where there is only one exam at the end of the semester, you are required to study and solve tasks every week.
Yandex is interested in training future employees for itself, so it invests a lot in the teaching as well. Moreover, studying at SDA gives you an opportunity to get an internship at Yandex.
What I learned
This school gave me a chance to study and enjoy programming and math. I acquired knowledge of machine learning and its applications. It doesn’t replace experience and real practice (obviously, like any course), but it helps in finding the right direction for solving a particular problem.
It taught me to never underestimate human capabilities. For a couple of months, I recall, I was studying at university and at SDA while also holding a job. It forced me to prioritise and be selective.
After studying there, almost all doors in IT are open for you.
For me, SDA was a chance to find my preferences. I chose the direction in which I really want to grow. The school developed in me a passion and a skill for working with data, and the ability to apply algorithms to it to solve business problems.
Although it might seem that I adore the school, it has several drawbacks.
While the education builds a solid foundation for a career, these courses don’t help you to see how companies operate with data. In reality, more often than not, you don’t even need ML or sophisticated algorithms to solve a business task. After the program you are armed with algorithms, knowledge and tools, but it takes time to develop expertise and common sense, since the most important thing for business is practical application, not the beauty of the algorithms under the hood.
In addition, I would add more complex projects and not limit them to the second year: for example, building a complete data-driven application or conducting scientific research.
Overall, I’m glad that I was able to be one of the students. You can always learn on your own, and from my point of view that is more valuable, but SDA provides conditions under which you feel more inspired and motivated.