The conference is over. See you at the next Highload++!

Moscow, SKOLKOVO
November 8 and 9, 2018

Building Resilient Data Pipelines in Go

Databases and storage systems

Talk rejected

Grant Griffiths

GE Digital

I’ve been a Gopher for 3 years and work as a Senior Software Engineer at GE Digital. I’m very passionate about Go and run the Go User Group at GE. We hold bi-weekly talks that draw anywhere from 50 to 200 Gophers, depending on the topic, with internal and external speakers such as Daniel Whitenack, Wally Quevedo, Joe Beda, and soon Francesc Campoy. I’ve spoken at GopherCon UK and locally at the GoSF meetup in San Francisco.

At GE Digital, I work in the Predix Cloud Engineering org, where I build Data Services in Go that store and process Industrial IoT data with Kafka, Cassandra, EMR/Spark, and much more. In addition to writing backend services, I also work on Site Reliability Engineering for my team, improving the monitoring/alerting, reliability, and performance for all data services. Specifically, I’ve built tooling that can simulate end users for black box testing.

I studied Computer Science and Mathematics at Syracuse University and like climbing rock, ice, and snow in my free time.

Abstract

The modern world runs on data. In this talk we will cover how Gophers of any level can easily build data pipelines in Go with Kafka and Cassandra. At the end, we will look at how GE has written a data pipeline in Go that can handle over 800,000 writes per second of industrial time series data.

Introduction to Data Pipelines:
- What are data pipelines
- Why Go is a good language for them

Package structure:
- How to lay out our data pipeline’s packages
- How data will flow throughout the application
- Example code
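
The talk's own example code is not part of this page. As a minimal sketch of how data might flow through such an application, the stages below (source, transform, sink) are connected with channels. The names `Message`, `source`, `transform`, `sink`, and `runPipeline` are hypothetical; in a real layout each stage would probably live in its own package, with Kafka behind the source and Cassandra behind the sink.

```go
package main

import (
	"fmt"
	"strings"
)

// Message is a hypothetical raw payload passed between stages.
type Message struct {
	Key   string
	Value string
}

// source emits every message on a channel and closes it when done.
// In a real pipeline this stage would read from Kafka.
func source(msgs []Message) <-chan Message {
	out := make(chan Message)
	go func() {
		defer close(out)
		for _, m := range msgs {
			out <- m
		}
	}()
	return out
}

// transform upper-cases each value; it stands in for real ETL logic.
func transform(in <-chan Message) <-chan Message {
	out := make(chan Message)
	go func() {
		defer close(out)
		for m := range in {
			m.Value = strings.ToUpper(m.Value)
			out <- m
		}
	}()
	return out
}

// sink drains the channel into a slice; a real sink would write to storage.
func sink(in <-chan Message) []Message {
	var stored []Message
	for m := range in {
		stored = append(stored, m)
	}
	return stored
}

// runPipeline wires the three stages together.
func runPipeline(msgs []Message) []Message {
	return sink(transform(source(msgs)))
}

func main() {
	out := runPipeline([]Message{
		{Key: "sensor-1", Value: "ok"},
		{Key: "sensor-2", Value: "warn"},
	})
	fmt.Println(out)
}
```

Because each stage runs in a single goroutine and the channels are unbuffered, message order is preserved and a slow sink naturally applies backpressure to the source.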

Writing integration tests:
- Using docker to write integration tests
- Simulating downtime using docker pause on Kafka or Cassandra
- Example code
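
The sketch below is not the talk's test suite. It illustrates the idea under stated assumptions: `flakySink` is a hypothetical in-process stand-in for a paused container, `writeWithRetry` is a hypothetical retry helper, and `dockerCmd` only builds a `docker pause` or `docker unpause` command line without running it. The container name `kafka` is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
	"time"
)

// errUnavailable stands in for the error a client sees while a container is paused.
var errUnavailable = errors.New("backend unavailable")

// flakySink is a hypothetical in-memory backend that rejects the first
// failuresLeft writes, imitating a paused Kafka or Cassandra container.
type flakySink struct {
	failuresLeft int
	stored       []string
}

func (s *flakySink) Write(v string) error {
	if s.failuresLeft > 0 {
		s.failuresLeft--
		return errUnavailable
	}
	s.stored = append(s.stored, v)
	return nil
}

// writeWithRetry retries a write up to maxAttempts times, doubling the
// delay after each failure. It returns the number of attempts made.
func writeWithRetry(s *flakySink, v string, maxAttempts int, delay time.Duration) (int, error) {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if err = s.Write(v); err == nil {
			return attempt, nil
		}
		if attempt < maxAttempts {
			time.Sleep(delay)
			delay *= 2
		}
	}
	return maxAttempts, fmt.Errorf("giving up after %d attempts: %w", maxAttempts, err)
}

// dockerCmd builds (but does not run) a docker CLI call such as
// "docker pause kafka"; an integration test would call Run() on the result.
func dockerCmd(action, container string) *exec.Cmd {
	return exec.Command("docker", action, container)
}

func main() {
	sink := &flakySink{failuresLeft: 2}
	attempts, err := writeWithRetry(sink, "point-1", 5, 10*time.Millisecond)
	fmt.Println(attempts, err, sink.stored)
	fmt.Println(dockerCmd("pause", "kafka").Args)
}
```

In a real integration test, the fake sink would be replaced by the actual client, with `dockerCmd("pause", ...)` run before the write and `dockerCmd("unpause", ...)` run while the retries are in flight.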

Data source: Kafka:
- What is Kafka and how we can use it with Go
- How to ensure no data loss is possible with good offset management
- Reading from multiple Kafka partitions with a high level consumer
- Example code
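
One common way to avoid data loss is to commit an offset only after the message at that offset has been processed and stored. The sketch below shows that rule without a real Kafka client: `partition`, `store`, and `consume` are hypothetical in-memory stand-ins, and the failure at offset 2 is simulated.

```go
package main

import (
	"errors"
	"fmt"
)

// partition is a hypothetical in-memory stand-in for one Kafka partition.
// committed is the next offset to read after a restart.
type partition struct {
	log       []string
	committed int
}

// store is a hypothetical sink that fails once at offset failAt.
type store struct {
	rows   []string
	failAt int
	failed bool
}

func (s *store) handle(offset int, msg string) error {
	if offset == s.failAt && !s.failed {
		s.failed = true
		return errors.New("store unavailable")
	}
	s.rows = append(s.rows, msg)
	return nil
}

// consume reads from the last committed offset and advances the commit
// only after handle succeeds, so a failed message is re-read on restart.
func consume(p *partition, handle func(offset int, msg string) error) error {
	for off := p.committed; off < len(p.log); off++ {
		if err := handle(off, p.log[off]); err != nil {
			return fmt.Errorf("offset %d: %w", off, err)
		}
		p.committed = off + 1
	}
	return nil
}

func main() {
	p := &partition{log: []string{"a", "b", "c", "d"}}
	s := &store{failAt: 2}

	// First run stops at the simulated failure; the offset stays uncommitted.
	fmt.Println(consume(p, s.handle), p.committed)

	// Second run resumes from the committed offset and finishes the log.
	fmt.Println(consume(p, s.handle), p.committed, s.rows)
}
```

This gives at-least-once delivery: if a process dies after the write but before the commit, the message is handled again on restart, so the sink's writes should be idempotent.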

Performing ETL on the data and business logic:
- Parsing data
- Data ETL
- Performing intermediary business logic
- Example code
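
As a minimal sketch of the parsing and transformation step, the function below decodes a JSON time series message, validates it, and normalizes it. The message format (`tag`, `ts_ms`, `value`) and the names `rawPoint`, `Point`, and `parsePoint` are assumptions made for illustration; the talk's actual format is not given here.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
	"time"
)

// rawPoint mirrors a hypothetical wire format. Value is a pointer so a
// missing field can be told apart from a genuine zero reading.
type rawPoint struct {
	Tag       string   `json:"tag"`
	Timestamp int64    `json:"ts_ms"`
	Value     *float64 `json:"value"`
}

// Point is the cleaned-up record handed to the storage stage.
type Point struct {
	Tag   string
	Time  time.Time
	Value float64
}

// parsePoint decodes, validates, and normalizes one message.
func parsePoint(data []byte) (Point, error) {
	var r rawPoint
	if err := json.Unmarshal(data, &r); err != nil {
		return Point{}, fmt.Errorf("decode point: %w", err)
	}
	tag := strings.ToLower(strings.TrimSpace(r.Tag))
	if tag == "" {
		return Point{}, errors.New("point has no tag")
	}
	if r.Value == nil {
		return Point{}, errors.New("point has no value")
	}
	return Point{
		Tag:   tag,
		Time:  time.UnixMilli(r.Timestamp).UTC(),
		Value: *r.Value,
	}, nil
}

func main() {
	p, err := parsePoint([]byte(`{"tag":" Turbine-1 ","ts_ms":1541635200000,"value":42.5}`))
	fmt.Println(p, err)

	_, err = parsePoint([]byte(`{"tag":"turbine-1"}`))
	fmt.Println(err)
}
```

Returning an error for malformed input, rather than panicking, lets the caller decide whether to skip the message or route it to a dead-letter destination.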

Persistent Data Storage: Cassandra:
- Setting up gocql to write to Cassandra
- Best practices for writing to Cassandra
- Example code
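
The talk's gocql setup is not shown on this page. One commonly recommended pattern for high write throughput is many concurrent single-row inserts with a bounded number of workers, sketched below without a real Cassandra connection. `Row`, `memStore`, and `writeAll` are hypothetical; with gocql, the `insert` function would presumably wrap a call such as `session.Query(...).Exec()`.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// Row is a hypothetical time series row.
type Row struct {
	Tag   string
	Value float64
}

// memStore is an in-memory stand-in for a Cassandra session.
type memStore struct {
	mu   sync.Mutex
	rows []Row
}

// Insert is safe for concurrent use, as a real session's writes would be.
func (m *memStore) Insert(r Row) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.rows = append(m.rows, r)
	return nil
}

// writeAll fans rows out to a fixed number of workers and reports how
// many inserts succeeded and how many failed.
func writeAll(rows []Row, workers int, insert func(Row) error) (written, failed int) {
	if workers < 1 {
		workers = 1
	}
	jobs := make(chan Row)
	var ok, bad int64
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for r := range jobs {
				if err := insert(r); err != nil {
					atomic.AddInt64(&bad, 1)
				} else {
					atomic.AddInt64(&ok, 1)
				}
			}
		}()
	}
	for _, r := range rows {
		jobs <- r
	}
	close(jobs)
	wg.Wait()
	return int(ok), int(bad)
}

func main() {
	store := &memStore{}
	rows := []Row{{Tag: "turbine-1", Value: 1.5}, {Tag: "turbine-2", Value: 2.5}}
	written, failed := writeAll(rows, 4, store.Insert)
	fmt.Println(written, failed, len(store.rows))
}
```

The worker count caps the number of in-flight writes, which keeps a burst of input from overwhelming the cluster; the right value depends on the deployment and would need to be measured.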

Demo: Processing hundreds of thousands of messages:
- Finished version of our demo application
- Running in a Kubernetes cluster at scale
- Killing components, seeing how it recovers
- Finished example code

Example use case: Go Data Pipeline at GE Digital:
- Example data pipeline that’s running in production at GE Digital
- Production results of a similar data pipeline with over 800,000 writes per second

Databases / other, Scaling from scratch, Go
