The conference is over. See you at the next Highload++!
Moscow, SKOLKOVO
November 8 and 9, 2018

Building useful and scalable Data Lake
BigData and machine learning

Talk declined
Guy Cohen
Gett

I have been working in the Big Data area for almost 5 years with Hive, S3, Spark and Presto. I established the Data Lake at Gett 3 years ago and have kept developing it ever since. Today, as the Big Data Architect at Gett, I face challenges with data that I would love to share with my colleagues.

Abstract

There are many options for collecting and storing data. An RDBMS works fine for small and medium datasets with well-defined schemas. But what if you also have very large datasets with a variety of mixed schemas? How can you store them once and use them later for all your analytic use cases?

Building useful and scalable Data Lake:
- Architectural considerations for building a scalable Data Lake
- Smart ways to make your Data Lake useful
- How to calculate business KPIs from raw data with Presto
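
To make the last point concrete, here is a minimal sketch of deriving daily KPIs from raw events with a single SQL aggregation. It is not taken from the talk: the `raw_orders` table, its columns, and the sample rows are hypothetical, and Python's built-in `sqlite3` stands in for Presto only so that the example runs without a cluster. The query sticks to plain `GROUP BY` and `CASE` expressions, which Presto also supports.

```python
import sqlite3

# Hypothetical raw events: (event_date, order_id, status, fare).
RAW_ORDERS = [
    ("2018-11-08", 1, "completed", 10.0),
    ("2018-11-08", 2, "cancelled", 0.0),
    ("2018-11-08", 3, "completed", 14.0),
    ("2018-11-09", 4, "completed", 8.0),
]

# One pass over the raw rows yields three KPIs per day:
# total orders, completed orders, and revenue from completed orders.
KPI_QUERY = """
SELECT event_date,
       COUNT(*) AS orders,
       SUM(CASE WHEN status = 'completed' THEN 1 ELSE 0 END) AS completed,
       SUM(CASE WHEN status = 'completed' THEN fare ELSE 0 END) AS revenue
FROM raw_orders
GROUP BY event_date
ORDER BY event_date
"""


def daily_kpis(rows):
    """Load raw rows into an in-memory table and return one KPI row per day."""
    conn = sqlite3.connect(":memory:")  # stand-in for a Presto connection
    try:
        conn.execute(
            "CREATE TABLE raw_orders "
            "(event_date TEXT, order_id INTEGER, status TEXT, fare REAL)"
        )
        conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?, ?)", rows)
        return conn.execute(KPI_QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in daily_kpis(RAW_ORDERS):
        print(row)
```

In a real Data Lake the same query would typically run against a table defined over files in object storage, with the date column used as a partition key so that only the relevant days are scanned.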

Databases / other, Big Data and Highload in Enterprise
