Anomaly Detection with Richard

Andrej Spasovski

12 minutes read

Mon Feb 7, 2022

Ataccama is growing, and so is the opportunity to use AI and machine learning to boost our platform’s capabilities. Ataccama’s AI research team is behind a lot of the smarts in our product. We sat down with Richard, our AI Researcher responsible for anomaly detection, to learn more about AI at Ataccama and, specifically, about finding anomalies in complex data. With the recent growth of our business, our clients have started asking more and more questions about data quality and how we can help them make sure their data is correct and reliable.

Richard, hi! To kick things off, tell us about what got you to Ataccama.

Let’s start at the beginning. I finished my master’s degree in artificial intelligence in Prague, and I was deciding what to do next. I liked solving hard problems and finding answers to difficult questions, so that was my main motivator. Before finishing my master’s in Prague, I’d also spent a year in Texas, where I first started looking into research work in AI and machine learning. So, I knew that it was something I enjoyed, and I also wanted to see more of what it’s like to live abroad, and doing a PhD seemed like a perfect combination.

I chose to do my PhD in England because of its great universities and because it’s not that far from the Czech Republic. I applied and got a full 4-year scholarship. My scholarship allowed me to focus purely on research, although I also had the opportunity to teach. I used that opportunity to improve my presentation skills and learn how to convey complex ideas in an understandable way, which turned out to be very beneficial in my current role. In the UK, I learned how to think about problems and how to define them. I realized that research starts with asking the right questions, which is often difficult. Only then can we start looking for a solution.

After finishing my PhD, I decided I wanted to apply the knowledge and experience I had gained in a more business-oriented environment. That is how I started my journey with Ataccama, where I focus on the anomaly detection domain.

How did that problem-solving line of thinking bring you to anomaly detection? What is it about anomaly detection that you especially like?

I like that it’s an easily defined problem with great impact and high potential business value. A wide range of people can understand what we’re doing. Artificial intelligence problems are sometimes difficult to explain, and it can be hard to understand why we should care about them. The anomaly detection domain, on the other hand, seems quite clear even to non-AI experts. Looking for anomalies, looking for something irregular? That’s very easy to explain. So that’s one thing.

Another thing I like is all the levels of complexity. You can start with very simple methods, where you just compute an average and check whether a value is too far from it. If it is, you mark it as anomalous. But of course, you can go deeper and work with much more complex data, where you actually need to use advanced methods and think harder about how to discover potential anomalies. You also need to be aware of the limitations of your AI models and how feasible their implementation is for our platform. That’s something I enjoy: you need to make sure your solution works end-to-end and verify that it’s the right approach for the problem at hand.
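To make the simplest case concrete, here is a minimal sketch of the average-based check Richard describes. It is an illustration only, not Ataccama’s implementation: the function name `flag_anomalies`, the `threshold` parameter, and the sample row counts are all assumptions, and the distance from the average is measured in standard deviations (a z-score), which is one common way to define “too far”.

```python
from statistics import mean, stdev


def flag_anomalies(values, threshold=3.0):
    """Return the indices of values lying more than `threshold`
    sample standard deviations away from the mean.

    Hypothetical helper for illustration; not part of any Ataccama API.
    """
    if len(values) < 2:
        return []  # a spread cannot be estimated from fewer than two points
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []  # all values are identical, so nothing stands out
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Made-up daily row counts with one obvious spike at the end.
daily_row_counts = [100, 102, 98, 101, 99, 103, 97, 100, 500]
print(flag_anomalies(daily_row_counts, threshold=2.0))  # prints [8]
```

The example uses a threshold of 2.0 rather than the default 3.0 because a single extreme value also inflates the mean and the standard deviation it is compared against. With only nine points, no value can score above roughly 2.67, so a threshold of 3.0 would flag nothing. This is one of the limitations that pushes real systems toward more robust methods.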

And in Ataccama, are you finding the entire range of problems that make it so fun?

I think we’re getting there. We needed to build the infrastructure first because when I came to Ataccama, there was basically no anomaly detection in the Ataccama One platform. So we first had to build the framework, and we basically built the AI core from scratch. There was a lot of engineering work at first. Since then, we have brought AI into several areas of our platform, and now we are starting to solve high-level problems that require innovative solutions and state-of-the-art AI algorithms. Right now we’re facing some pretty complex problems, like how to find anomalies in highly structured data. Large amounts of data might need to be processed in real time, and it needs to be done fast. Ultimately, we want to tell users clearly whether there’s something wrong with the data and, ideally, suggest how to fix it.

As Ataccama is growing, you need help with that on your team. What kind of person would you like to join? What would your ideal new colleague be like?

We definitely need smart people. But we also don’t want a one-track mind. We don’t want people who are super-technical but can’t really convey information. An ideal candidate can talk clearly to all members of the team and explain ideas well. It is someone who is able to frame the problem we are trying to solve, propose an efficient solution, and then make the outcome presentable. So we are looking for creative people with analytical minds who can come up with clever and innovative solutions and then explain the outcomes and implications to the team. To put it simply, we need people who can take end-to-end ownership of their work.

And when you find that person, and you’re going through the first couple of rounds, how would you win them over? What’s so cool about your team, that they would want to join and work with you?

I like the freedom to set my own priorities. It’s the one thing I was a bit worried about when I was transitioning from academia to a business environment. One of the main benefits of academic research is, of course, the freedom to define your own problems and fields of interest. You can choose to work on whatever you want. You’ll need to persuade a few people that it’s a good idea first, but that is usually a surmountable obstacle. I have found that this is, to some extent, the case at Ataccama as well.

When you’re working on something that you like — and I hope most people do — it’s great to have the power to impact the product. And to persuade the more business-oriented people that it’s a good idea. At first you might have just a rough idea, not knowing whether the design will work or not, or if there’s any business value in it, but you can very easily reach out to a designer, or to a consultant, or basically whoever, and they’ll help you refine it. You pitch your idea, and you can basically start working on it immediately. You get some feedback, and then you refine it. You’re very close to the entire end-to-end process. Let’s say you design something — in a couple of months, you can actually see it in the platform. And you get feedback fast, feedback on how the users and the clients actually use it. I think that’s definitely something great as well. You don’t just feel like a cog in a huge machine — instead, you actually feel like you have a direct impact on the product.

What’s your vision for the future? How do you see the team, the product, the company growing, and where, in what direction?

Our client list is growing fast. Therefore, as a company, we’re also growing rapidly — we’ve recently more than doubled our headcount to over 390 people. We definitely want to keep walking that path, so there’s a big emphasis on scaling everything. It’s the case with anomaly detection as well, along with any other features in the platform. It means we’ll be getting more data, from more clients. It’s a big challenge — and an even bigger opportunity — to extend our anomaly detection capabilities to support huge amounts of data.

We are interested in doing anomaly detection on all the different levels of our data handling process. Sometimes it’s interesting to zoom in and do anomaly detection on the record level, directly on the data, or you might want to zoom out and do it on the level of the metadata, like different statistics about the data tables or columns. This way, you could perform anomaly detection of the entire data process. There are certainly many challenges, which vary in complexity, so there’s plenty of room for devising more advanced and more complex anomaly detection techniques. That’s also why we need to grow the team.
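As a rough illustration of the “zoomed out” metadata level, the sketch below profiles a column into a few summary statistics and compares the latest profile with earlier ones. Everything here is an assumption made for the example: the helper names `column_profile` and `anomalous_metrics`, the choice of statistics, and the sample numbers. It reuses a simple distance-from-the-average check and is not a description of how the Ataccama platform works.

```python
from statistics import mean, stdev


def column_profile(rows, column):
    """Summarize one column of a batch of records (a list of dicts).

    Hypothetical helper: these statistics stand in for column metadata.
    """
    values = [row.get(column) for row in rows]
    non_null = [v for v in values if v is not None]
    return {
        "row_count": len(values),
        "null_ratio": (len(values) - len(non_null)) / len(values) if values else 0.0,
        "mean": mean(non_null) if non_null else 0.0,
    }


def anomalous_metrics(history, latest, threshold=3.0):
    """Return the names of metrics in `latest` that deviate strongly
    from the same metrics in the earlier profiles in `history`."""
    flagged = []
    for metric, value in latest.items():
        past = [profile[metric] for profile in history]
        if len(past) < 2:
            continue  # not enough history to estimate a spread
        mu, sigma = mean(past), stdev(past)
        if sigma == 0:
            if value != mu:
                flagged.append(metric)  # any change from a constant history
        elif abs(value - mu) / sigma > threshold:
            flagged.append(metric)
    return flagged


# Made-up profiles from four earlier loads of the same table column.
history = [
    {"row_count": 1000, "null_ratio": 0.01, "mean": 50.0},
    {"row_count": 1010, "null_ratio": 0.02, "mean": 51.0},
    {"row_count": 990, "null_ratio": 0.01, "mean": 49.0},
    {"row_count": 1005, "null_ratio": 0.02, "mean": 50.5},
]
# The latest load looks normal in size and average, but has many nulls.
latest = {"row_count": 1002, "null_ratio": 0.40, "mean": 50.2}
print(anomalous_metrics(history, latest))  # prints ['null_ratio']
```

Checking statistics instead of individual records keeps the comparison cheap even for very large tables, at the cost of not pointing to the specific records that caused the change.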

We’re also trying to move towards interacting with the user, learning their behavior and adapting the algorithms accordingly, because one of the things about anomaly detection is that it is quite subjective. Something being anomalous can mean different things to different people. I think the route for us is to personalize the algorithm for different types of users, learn what they actually want, and deliver rich, accurate results.

Another big challenge is to move beyond simple anomaly detection and into anomaly repair, together with the client. Our goal is not just to find the broken data — in the end, we want to fix it. Here I see a huge opportunity for our platform to really feel like an intelligent cooperative tool, which can lead the user and suggest what to do in order to achieve the best experience possible.

Let’s shift gears for a bit and get a little personal — we now know you’ve traveled and studied all over the world. So, how many languages do you actually speak?

That’s a very difficult question to answer. I basically keep learning and forgetting different languages in parallel. Apart from English, obviously, I studied French for many years. I then lived for a year in France, studying in French, so you can imagine I was fluent by the end. However, after coming back from France I didn’t study French anymore, and slowly but surely I started forgetting it. When I lived in Texas I picked up Spanish. I was studying in El Paso, right on the border with Mexico, so it was a perfect opportunity to study Spanish intensively, with plenty of occasions to use it in practice. Two years ago I started learning Italian, since I have plenty of Italian friends and I very much like the culture and Italy itself. I don’t feel like I have any special talent for languages, but I enjoy studying them and, through that, learning about other cultures. It also helps me activate other parts of my brain after work. :)

Which has been your favorite to learn? Why?

I really like Italian. You can have a basic conversation quite quickly after you start learning it. It also helped me greatly that I already knew some other Romance languages, so I could quickly start having some basic interactions with my Italian friends. I also like the sound of it and the fact that the pronunciation is quite easy compared to, for example, French. :)

If you could only ski or only bike, which would you pick? Why?

It depends on the season, obviously. I have always done many different sports, such as skiing or ice-skating in the winter. A couple of years ago my friends and I started ski touring, mainly in the Alps, which I really enjoy. It gives you a real sense of freedom, because you can reach quite inaccessible summits in beautiful nature without any people in sight. Of course, it’s a good idea to bring your avalanche rescue set, just in case. :) When it gets warmer, I like running or riding my bicycle. I also used to rock climb. Now, my friends and I like to do via ferratas, mostly in the Dolomites. So to answer your question, I would pick one depending on the season.

Liverpool and El Paso seem like two different worlds — what did you like most about each? And what did you like the least?

They were different indeed. I lived in El Paso for a year and in Liverpool for 4 years. I really enjoyed both and gained a lot of experience from each. I was on an exchange program in El Paso, studying at the university and also working as a research assistant there. El Paso is a very interesting place, right on the border with Mexico, more specifically with Ciudad Juárez, which we visited many times. As a result, El Paso has a nice combination of American and Mexican culture, which was very interesting for me. I had a great time there, traveled a lot, and met a lot of nice people. One of the best things about El Paso is the weather. It’s almost always sunny, with only a few cloudy days per year and 1 or 2 days of rain. However, it can get very hot during the summer.

On the other hand, Liverpool is very rainy, obviously, with not that many sunny days. I was doing my PhD in Liverpool, where I was mainly focusing on research, but I was also teaching and doing outreach activities for different audiences to promote the university. Liverpool is a great city with the friendliest people, always eager to chat with you. However, one thing that was quite difficult at first was the accent: they speak Scouse in Liverpool, which took some time to get used to.

What’s your favorite place in the whole world?

I have been to some interesting places and I really enjoyed many of them, but I have always felt that the Czech Republic is my home. I’ve met a lot of Czechs who feel the same way. Even after many years abroad, they eventually returned. So if I had to pick a place, it would be the Czech Republic, where I have my family and friends. Also, the more countries I visited, the more I realized how many amazing places we have here, and that there are many things we should be grateful for.

Interested in anomaly detection and tackling new challenges head on? Join Richard’s team!

Take a look at the open positions on our job portal and find the best role for you!