Live Transcript Delivery

Post on 13-Feb-2017

Scalable real-time text streaming

GRZEGORZ GIEDROJC, GRZEGORZ KOLPUC

Street Events Master: Grzegorz Giedrojc

Event Platform: Grzegorz Kolpuc

Financial & Risk

IP & Science

Legal News

Tax & Accounting

Technology & OPS

60,000+ EMPLOYEES

10,000+ IN TECHNOLOGY

1200+ EMPLOYEES IN GDYNIA

150+ IN TECHNOLOGY

Eikon

Transcripts

Core Abstractions

● EVENT
● TRANSCRIPT
● BRIEF
● LIVE TRANSCRIPT

Event

Earnings Release

Earnings Call

Earnings Presentation

Event

Transcript

Brief

Transcript delivery

1. Planning
2. Preparations
3. Streaming
4. Final version
5. Audio synchronization

Planning

Preparations

Theoretical:
➔ Company data
➔ Potential participants
➔ Historical experience

Technical:
➔ TranXP
➔ Dragon

TranXP

Dragon Training

Additional Equipment

Final Version

Audio Synchronization

External Service

Audio sync. Transcript / Brief

mp3

Transcript Brief

Movie demo

Architecture

Streaming - High-Level View

[Diagram: TranXP, External Vendor, Blackbird, Event Platform, Customers. Blackbird detail: WS ×4, APP ×4. Server sizes shown: 40 cores / 256 GB and 12 cores / 12 GB]

LT Entry Point

[Diagram: TranXP, Dragon, Company Info, Blackbird (WS ×4, APP ×4)]

Event Collection Tool (ECT)

[Diagram: TranXP, Dragon, Company Info, Blackbird (WS ×4, APP ×4), ECT]

Audio Sync

[Diagram: TranXP, Dragon, Company Info, ECT, Blackbird (WS ×4, APP ×4), Audio Sync.]

A background process synchronizes audio with text

Transcript Distribution

[Diagram: TranXP, Audio Sync., Dragon, Company Info, ECT, Blackbird (WS ×4, APP ×4), Internal LT Broker]

Internal LT Broker: distributes LT internally

Where we are so far...

● Transcripts
  ○ Created by TR
  ○ Delivered by vendors

● Internal distribution only (inside TR)

Our goal

● Deliver live Transcripts to external clients
  ○ In a distributed manner
  ○ Scalable
  ○ Highly available (HA)

● Evolution, not Revolution

Event Platform

● Main ‘Events’ provider across TR
● Aggregates various content and serves it as events

● Should serve Live Transcripts to internal and external clients

Transcript Receiver

Messaging?

● Base features:
  ○ Message order
  ○ Parallel consumers

● Distributed and scalable
● Fault tolerant
● No data loss (replication)
● May repeat messages (nice to have)

Solution?

● Publish-subscribe messaging
● Distributed and scalable
● Partitioned topics
  ○ Each partition may be consumed independently
● No server-side ACKs
  ○ Acknowledgement is the consumer's responsibility (increases throughput)
● Messages stored as a distributed commit log
  ○ A consumer can start reading from any point in time (based on offsets)

Apache Kafka
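
The commit-log idea on this slide can be illustrated with a small in-memory sketch. It is not the Kafka client API: `PartitionedLog`, `Consumer`, and the `event-42` key are hypothetical names made up for illustration, and a real cluster adds replication, persistence, and networking. The sketch only shows that records with the same key land in one partition in order, and that the consumer (not the log) tracks offsets, so it can rewind and replay.

```python
from collections import defaultdict
from zlib import crc32


class PartitionedLog:
    """Toy in-memory stand-in for a partitioned topic (hypothetical, not the Kafka API)."""

    def __init__(self, num_partitions):
        self.partitions = [[] for _ in range(num_partitions)]

    def partition_for(self, key):
        # Same key -> same partition, so per-key order is preserved.
        return crc32(key.encode("utf-8")) % len(self.partitions)

    def append(self, key, value):
        p = self.partition_for(key)
        self.partitions[p].append((key, value))
        return p, len(self.partitions[p]) - 1  # (partition, offset)

    def read(self, partition, offset, max_records=10):
        # Reading never removes data: the log is append-only.
        return self.partitions[partition][offset:offset + max_records]


class Consumer:
    """Tracks its own offsets; the log keeps no per-consumer state."""

    def __init__(self, log):
        self.log = log
        self.offsets = defaultdict(int)

    def poll(self, partition):
        records = self.log.read(partition, self.offsets[partition])
        self.offsets[partition] += len(records)
        return [value for _, value in records]

    def seek(self, partition, offset):
        self.offsets[partition] = offset


log = PartitionedLog(num_partitions=3)
for line in ["Good morning.", "Welcome to the call.", "Let's begin."]:
    log.append("event-42", line)  # "event-42" is a made-up event key

p = log.partition_for("event-42")
consumer = Consumer(log)
first_read = consumer.poll(p)
consumer.seek(p, 1)  # rewind and replay from offset 1
replayed = consumer.poll(p)
```

Because offsets live on the consumer side, several consumers could read the same partition at different positions without the log coordinating them.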

Apache Kafka

[Diagram: Transcript Receiver; Kafka Cluster with three Brokers]

What we need now?

● Processing engine to consume feed
  ○ Very low latency
  ○ Open source
  ○ Stream grouping (no race condition)

● Distributed and scalable
  ○ Configurable parallelism

● Fault tolerant
● No data loss
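
One way to read "stream grouping (no race condition)" is key-based routing, in the spirit of what Storm calls a fields grouping: every message for a given transcript goes to the same task, so no two parallel tasks ever touch the same transcript. The sketch below is a plain-Python illustration under that assumption. `task_for`, `route`, and the message layout `(transcript_id, sequence, text)` are hypothetical and not taken from the presentation.

```python
from zlib import crc32


def task_for(key, num_tasks):
    # Deterministic hash so the same key always maps to the same task.
    return crc32(key.encode("utf-8")) % num_tasks


def route(messages, num_tasks):
    """Group (transcript_id, sequence, text) messages into per-task queues."""
    queues = [[] for _ in range(num_tasks)]
    for transcript_id, seq, text in messages:
        queues[task_for(transcript_id, num_tasks)].append((transcript_id, seq, text))
    return queues


# Two interleaved transcripts (made-up sample data).
messages = [
    ("call-A", 1, "Good morning."),
    ("call-B", 1, "Thank you, operator."),
    ("call-A", 2, "Welcome to the call."),
    ("call-B", 2, "Revenue grew."),
]
queues = route(messages, num_tasks=4)
```

Each queue can then be drained by its own worker: parallelism comes from the number of tasks, while ordering within a transcript is kept because one transcript never spans two queues.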

Solution?

➔ Popular real-time computation system
➔ Widely used in production by many companies
  ◆ e.g. Twitter, Yahoo, Spotify

➔ Distributed, scalable and fault-tolerant
➔ Open-sourced in 2011 by Nathan Marz
➔ Written in Clojure

Apache Storm

➔ Executes a topology (a Storm program) in a distributed manner

➔ A topology runs as a set of Spouts and Bolts

➔ A single message is represented as a Tuple
➔ An unbounded sequence of tuples is a Stream

Core Concepts
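
The vocabulary above (spout, bolt, tuple, stream) can be mimicked in a few lines of single-process Python. This is only a sketch of the concepts, not the Storm API: the class names `TranscriptSpout`, `NormalizeBolt`, `DeliveryBolt` and the helper `run_topology` are hypothetical, and the stream here is finite, whereas a real stream is unbounded and runs distributed across workers.

```python
class TranscriptSpout:
    """Source of the stream: emits one tuple per transcript line."""

    def __init__(self, lines):
        self.lines = lines

    def stream(self):
        for seq, text in enumerate(self.lines, start=1):
            yield (seq, text)  # a tuple: (sequence number, text)


class NormalizeBolt:
    """Transforms each tuple: trims surrounding whitespace."""

    def process(self, tup):
        seq, text = tup
        return (seq, text.strip())


class DeliveryBolt:
    """Terminal bolt: collects tuples, standing in for client delivery."""

    def __init__(self):
        self.delivered = []

    def process(self, tup):
        self.delivered.append(tup)
        return tup


def run_topology(spout, bolts):
    # Push every tuple from the spout through the bolts in order.
    for tup in spout.stream():
        for bolt in bolts:
            tup = bolt.process(tup)


spout = TranscriptSpout(["  Good morning. ", "Welcome to the call.\n"])
sink = DeliveryBolt()
run_topology(spout, [NormalizeBolt(), sink])
```

After the run, `sink.delivered` holds the normalized tuples in the order the spout emitted them.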

Storm Daemons - Nimbus

[Diagram: Storm Cluster with Nimbus and Nodes 1-3]

● Distributes code around the cluster
● Assigns tasks to machines
● Monitors for failures

Storm Daemons - Supervisor

[Diagram: Storm Cluster with Nimbus and a Supervisor on each of Nodes 1-3]

● Starts and stops worker processes based on what Nimbus has assigned to it
● Executes a subset of a topology: spouts and/or bolts

Fail Fast

[Diagram: Storm Cluster with Nimbus and a Supervisor on each of Nodes 1-3]

● Runs under supervision

Workers

[Diagram: Storm Cluster with Nimbus and Nodes 1-3, each with a Supervisor and four workers (W1)]

● Processing units
● Execute bolts and spouts as executors and tasks

Storm Parallelism

Storm Topology

[Diagram: Transcript Receiver; Kafka Cluster with three Brokers; Storm Cluster with Nimbus and Nodes 1-3, each with a Supervisor and four workers (W1)]

Client Delivery

[Diagram: Transcript Receiver; Kafka Cluster with three Brokers; Storm Cluster with Nimbus and Nodes 1-3, each with a Supervisor and four workers (W1); External Clients]

Live Transcript Bridge (LTB)

[Diagram: Transcript Receiver; Kafka Cluster with three Brokers; Storm Cluster with Nimbus and Nodes 1-3, each with a Supervisor and four workers (W1); LTB; External Clients]

What we achieved

● High-speed live transcript delivery
  ○ Messages are sent with sub-second latency

● Processing in a distributed manner
  ○ Replication
  ○ Fault tolerance

● Ready for more customers and more transcripts
  ○ All components may be scaled horizontally

QUESTIONS?

Thank You