Live Transcript Delivery

Live Transcript Delivery Scalable real-time text streaming GRZEGORZ GIEDROJC GRZEGORZ KOLPUC

Page 1: Live Transcript Delivery

Live Transcript Delivery

Scalable real-time text streaming

GRZEGORZ GIEDROJC
GRZEGORZ KOLPUC

Grzegorz Giedrojc
This slide is probably being dropped too? As I understand it, we are switching to the orange slides that separate the parts of the presentation?
Grzegorz Kołpuć
Yes, I think that will be more consistent.
Page 2: Live Transcript Delivery

Grzegorz Giedrojc Grzegorz Kolpuc

Page 3: Live Transcript Delivery

Street Events Master Grzegorz Giedrojc

Page 4: Live Transcript Delivery

Event Platform Grzegorz Kolpuc

Page 5: Live Transcript Delivery
Page 6: Live Transcript Delivery
Page 7: Live Transcript Delivery

Financial & Risk

IP & Science

Legal News

Tax & Accounting

Technology & OPS

Page 8: Live Transcript Delivery
Page 9: Live Transcript Delivery

60,000+ EMPLOYEES

10,000+ IN TECHNOLOGY

Page 10: Live Transcript Delivery

1200+ EMPLOYEES IN GDYNIA

150+ IN TECHNOLOGY

Page 11: Live Transcript Delivery

Eikon

Page 12: Live Transcript Delivery
Page 13: Live Transcript Delivery

Transcripts

Page 14: Live Transcript Delivery

Core Abstractions

● EVENT
● TRANSCRIPT
● BRIEF
● LIVE TRANSCRIPT

Page 15: Live Transcript Delivery

Event

Earnings Release

Earnings Call

Earnings Presentation

Page 16: Live Transcript Delivery

Event

Page 17: Live Transcript Delivery

Transcript

Page 18: Live Transcript Delivery

Brief

Page 19: Live Transcript Delivery

Transcript delivery

1. Planning
2. Preparations
3. Streaming
4. Final version
5. Audio synchronization

Page 20: Live Transcript Delivery

Planning

Page 21: Live Transcript Delivery

Preparations

Theoretical:
➔ Company data
➔ Potential participants
➔ Historical experience

Technical:
➔ TranXP
➔ Dragon

Page 22: Live Transcript Delivery

TranXP

Page 23: Live Transcript Delivery

Dragon Training

Page 24: Live Transcript Delivery

Additional Equipment

Page 25: Live Transcript Delivery

Final Version

Page 26: Live Transcript Delivery

Audio Synchronization

[Diagram: External Service, Audio sync., mp3, Transcript, Brief]

Page 27: Live Transcript Delivery

Movie demo

Grzegorz Kołpuć
Can you paste the link to this video into the 'note'?
Grzegorz Giedrojc
It would be best to download it right away so that it plays locally from your computer: https://thehub.thomsonreuters.com/videos/15510
Grzegorz Kołpuć
And how do I download it?
Page 28: Live Transcript Delivery

Architecture

Page 29: Live Transcript Delivery

Streaming - High Level View

[Diagram: TranXP, External Vendor, Blackbird, Event Platform, Customers]

Page 30: Live Transcript Delivery

Blackbird

[Diagram: Blackbird with four WS and four APP instances; hardware labels: 40 cores / 256 GB and 12 cores / 12 GB]

Grzegorz Kołpuć
I will break this diagram up a bit.
Grzegorz Kołpuć
We should take another look at this slide on Monday. It needs to be split into several slides so that we can explain, step by step, what goes where. I can do it, but we need to talk first so that I don't make up nonsense.
Page 31: Live Transcript Delivery

LT Entry Point

[Diagram: TranXP, Dragon, Company Info, and Blackbird with four WS and four APP instances]

Page 32: Live Transcript Delivery

Event Collection Tool

[Diagram: TranXP, Dragon, Company Info, ECT, and Blackbird with four WS and four APP instances]

Page 33: Live Transcript Delivery

Audio Sync

[Diagram: TranXP, Dragon, Company Info, ECT, Audio Sync., and Blackbird with four WS and four APP instances]

A background process synchronizes the audio with the text

Page 34: Live Transcript Delivery

Transcript Distribution

[Diagram: TranXP, Dragon, Company Info, ECT, Audio Sync., Internal LT Broker, and Blackbird with four WS and four APP instances]

Internal LT Broker: distributes LT internally

Grzegorz Kołpuć
It would be good to show some kind of 'flow' here, i.e. arrows showing what goes where.
Page 35: Live Transcript Delivery

Where we are so far...

● Transcripts
○ Created by TR
○ Delivered by vendors

● Internal distribution only (inside TR)

Page 36: Live Transcript Delivery

Our goal

● Deliver live Transcripts to external clients
○ In a distributed manner
○ Scalable
○ Highly available (HA)

● Evolution, not Revolution

Page 37: Live Transcript Delivery

Event Platform

● Main 'Events' provider across TR
● Aggregates various content and serves it as events
● Should serve Live Transcripts to internal and external clients

Page 38: Live Transcript Delivery

Transcript Receiver

Transcript Receiver

Page 39: Live Transcript Delivery

Messaging?

● Base features:
○ Message order
○ Parallel consumers

● Distributed and scalable
● Fault tolerant
● No data loss (replication)
● Can replay messages (nice to have)

Page 40: Live Transcript Delivery

Solution?

Page 41: Live Transcript Delivery

Apache Kafka

● Publish-subscribe messaging
● Distributed and scalable
● Partitioned topics
○ Each partition may be consumed independently
● No server-side ACKs
○ Acknowledgement is the consumer's responsibility (increases throughput)
● Messages stored as a distributed commit log
○ A consumer can start reading from any point in time (based on offsets)
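
The partition and offset ideas on this slide can be sketched with a small in-memory model. This is a toy illustration only, not the Kafka client API: `ToyTopic`, `ToyConsumer`, the CRC32 partitioner, and the `event-42` key are all assumptions made for the example. It shows that messages sharing a key stay in order within one partition, and that a consumer tracking its own offset can rewind and replay.

```python
import zlib
from collections import defaultdict


class ToyTopic:
    """In-memory stand-in for a partitioned topic (hypothetical, not Kafka's API)."""

    def __init__(self, num_partitions):
        # One append-only list per partition plays the role of the commit log.
        self.partitions = [[] for _ in range(num_partitions)]

    def partition_for(self, key):
        # Stable hash, so the same key always maps to the same partition.
        return zlib.crc32(key.encode("utf-8")) % len(self.partitions)

    def publish(self, key, value):
        p = self.partition_for(key)
        self.partitions[p].append((key, value))
        return p, len(self.partitions[p]) - 1  # (partition, offset)

    def read(self, partition, offset):
        # Reading never deletes anything; the log keeps every message.
        return self.partitions[partition][offset:]


class ToyConsumer:
    """Tracks its own offset per partition; the topic keeps no ACK state."""

    def __init__(self, topic):
        self.topic = topic
        self.offsets = defaultdict(int)

    def poll(self, partition):
        batch = self.topic.read(partition, self.offsets[partition])
        self.offsets[partition] += len(batch)
        return batch

    def seek(self, partition, offset):
        # Rewinding the offset is all it takes to replay earlier messages.
        self.offsets[partition] = offset


topic = ToyTopic(num_partitions=3)
for i, line in enumerate(["Good morning.", "Welcome to the call.", "Let's begin."]):
    topic.publish("event-42", f"{i}:{line}")

p = topic.partition_for("event-42")
consumer = ToyConsumer(topic)
first = consumer.poll(p)     # all three lines, in publish order
consumer.seek(p, 1)          # rewind to offset 1
replayed = consumer.poll(p)  # the last two lines again
```

Keying every fragment of a transcript by its event identifier is one plausible way to combine "message order" with "parallel consumers": order is kept per partition, while different events can be consumed in parallel.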

Page 42: Live Transcript Delivery

Apache Kafka

[Diagram: Transcript Receiver and Kafka Cluster with three Brokers]

Page 43: Live Transcript Delivery

What do we need now?

● Processing engine to consume the feed
○ Very low latency
○ Open source
○ Stream grouping (no race conditions)

● Distributed and scalable
○ Configurable parallelism

● Fault tolerant
● No data loss
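
The "stream grouping (no race conditions)" requirement can be illustrated with a key-based routing sketch, similar in spirit to what Storm calls a fields grouping. The hash function, the task count, and the `transcript-A` / `transcript-B` keys are assumptions for the example and do not reflect how any particular engine implements grouping.

```python
import zlib
from collections import defaultdict


def task_for(key, num_tasks):
    """Pick a task index from a stable hash of the grouping key."""
    return zlib.crc32(key.encode("utf-8")) % num_tasks


def route(messages, num_tasks):
    """Group (key, payload) messages by task, preserving arrival order."""
    per_task = defaultdict(list)
    for key, payload in messages:
        per_task[task_for(key, num_tasks)].append((key, payload))
    return per_task


feed = [
    ("transcript-A", "line 1"),
    ("transcript-B", "line 1"),
    ("transcript-A", "line 2"),
    ("transcript-B", "line 2"),
    ("transcript-A", "line 3"),
]
assignment = route(feed, num_tasks=4)
```

Because every message for a given transcript lands on the same task, no two parallel tasks ever update the same transcript concurrently, and the lines of one transcript are processed in the order they arrived.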

Page 44: Live Transcript Delivery

Solution?

Page 45: Live Transcript Delivery

Apache Storm

➔ Popular real-time computation system
➔ Widely used in production by many companies
◆ e.g. Twitter, Yahoo, Spotify
➔ Distributed, scalable and fault-tolerant
➔ Open-sourced in 2011 by Nathan Marz
➔ Written in Clojure

Page 46: Live Transcript Delivery

Core Concepts

➔ Executes a topology (a Storm program) in a distributed manner
➔ A topology runs as a set of Spouts and Bolts
➔ A single message is represented as a Tuple
➔ An unbounded sequence of tuples is a Stream
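
The relationship between spouts, bolts, tuples, and streams can be sketched in a single process with Python generators. This is only a conceptual model under stated assumptions: a real Storm topology runs its spouts and bolts distributed across a cluster, and the function names, the tuple layout `(event_id, text)`, and the sequence numbering are hypothetical.

```python
from typing import Dict, Iterable, Iterator, Tuple

TranscriptTuple = Tuple[str, str]  # (event_id, text); layout is an assumption


def transcript_spout(lines: Iterable[TranscriptTuple]) -> Iterator[TranscriptTuple]:
    """Source of the stream: emits one tuple per incoming transcript line."""
    for item in lines:
        yield item


def normalize_bolt(stream: Iterable[TranscriptTuple]) -> Iterator[TranscriptTuple]:
    """Transforms each tuple: trims whitespace and drops empty lines."""
    for event_id, text in stream:
        cleaned = text.strip()
        if cleaned:
            yield event_id, cleaned


def tag_bolt(stream: Iterable[TranscriptTuple]) -> Iterator[Tuple[str, int, str]]:
    """Adds a per-event sequence number to every tuple."""
    counters: Dict[str, int] = {}
    for event_id, text in stream:
        counters[event_id] = counters.get(event_id, 0) + 1
        yield event_id, counters[event_id], text


def run_topology(lines: Iterable[TranscriptTuple]):
    # Wiring: spout -> normalize_bolt -> tag_bolt
    return list(tag_bolt(normalize_bolt(transcript_spout(lines))))


result = run_topology(
    [("ev-1", "  Hello  "), ("ev-1", "   "), ("ev-2", "Hi"), ("ev-1", "Thanks")]
)
```

Here the spout is the only component that touches the input, and each bolt consumes one stream and emits another, which mirrors how a topology chains processing steps.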

Page 47: Live Transcript Delivery

Storm Daemons - Nimbus

[Diagram: Storm Cluster with Nimbus and Node 1 to Node 3]

● Distributes code around the cluster
● Assigns tasks to machines
● Monitors for failures

Page 48: Live Transcript Delivery

Storm Daemons - Supervisor

[Diagram: Storm Cluster with Nimbus and Node 1 to Node 3, each with a Supervisor]

● Starts and stops worker processes based on what Nimbus has assigned to it
● Each worker process executes a subset of a topology (spouts and/or bolts)

Page 49: Live Transcript Delivery

Fail Fast

[Diagram: Storm Cluster with Nimbus and Node 1 to Node 3, each with a Supervisor]

● Runs under supervision

Page 50: Live Transcript Delivery

Workers

[Diagram: Storm Cluster with Nimbus and Node 1 to Node 3, each with a Supervisor and four workers labelled W1]

● Processing units
● Execute bolts and spouts as executors and tasks

Page 51: Live Transcript Delivery

Storm Parallelism

Page 52: Live Transcript Delivery

Storm Topology

[Diagram: Transcript Receiver, Kafka Cluster (three Brokers), and Storm Cluster (Nimbus; Node 1 to Node 3, each with a Supervisor and four workers labelled W1)]

Page 53: Live Transcript Delivery

Client Delivery

[Diagram: Transcript Receiver, Kafka Cluster (three Brokers), Storm Cluster (Nimbus; Node 1 to Node 3, each with a Supervisor and four workers labelled W1), and External Clients]

Page 54: Live Transcript Delivery

Live Transcript Bridge

[Diagram: Transcript Receiver, Kafka Cluster (three Brokers), Storm Cluster (Nimbus; Node 1 to Node 3, each with a Supervisor and four workers labelled W1), LTB, and External Clients]

Page 55: Live Transcript Delivery

What we achieved

● High-speed live transcript delivery
○ Messages are sent with sub-second latency

● Processing in a distributed manner
○ Replication
○ Fault tolerance

● Ready for more customers and more transcripts
○ All components can be scaled horizontally

Page 56: Live Transcript Delivery

QUESTIONS?

Page 57: Live Transcript Delivery

Thank You