Elasticsearch At Stack Overflow
With Andrew Montalenti (Co-founder/CTO, Parse.ly) & Peter Soderberg (Solutions Architect, Elastic)
Stack Exchange, 110 William St
Oct 22 (Tue), 2019 @ 06:00 PM
FREE
DETAILS
Heya all!
As we kick into fall, we are super excited to announce our next Meetup at Stack Overflow. We welcome Andrew Montalenti, co-founder & CTO of Parse.ly, as our guest speaker. Andrew will be giving a talk titled "Improving High-Cardinality Analytics with Index-Stored HyperLogLogs".
From the Elastic side, Peter Soderberg, Elastic Solutions Architect, will be speaking about the first open source APM: Elastic APM.
NB: Stack Overflow is located on the 28th floor.
**BONUS: We will be giving away a free ticket to the Elastic{ON} Tour - NY at this event.** (https://www.elastic.co/elasticon/tour/new-york)
----
Talk 1: Improving High-Cardinality Analytics with Index-Stored HyperLogLogs
Abstract: One of Elasticsearch's most powerful analytics features is the "cardinality" aggregation, which can do blazing-fast distinct counts across millions of high-cardinality document values by leveraging the probabilistic data structure HyperLogLog++ (aka HLL). In this presentation, Andrew Montalenti, the co-founder & CTO of Parse.ly, will discuss the open source work his team has done to bring "index-stored HLLs" to Elasticsearch. We'll begin with a discussion of what HLL is & how it works; in other words, how cardinality aggregations work under the hood. Then we'll discuss why massive cost & performance savings (on the order of 10x) can come from index-storage of serialized HLLs, with the key trade-off being individual value searchability. Finally, we'll showcase how a small team of (primarily Python) programmers managed to get its head around Elasticsearch's Java codebase in order to build a new custom aggregation & a new index type to support this use case, & why we're releasing this functionality as open source. We'll close with a discussion of the broader effort for index-stored data sketches in ES, which, in the future, might include the percentiles aggregation & its TDigest data structure.
Bio: Andrew Montalenti (@amontalenti) is the co-founder & CTO of Parse.ly (https://parse.ly), the creator of the top audience analytics system for content teams. Parse.ly tech is leveraged by top sites like Ars Technica, Bloomberg, & The Wall Street Journal, empowering content pros with real-time & historical analytics over all their web assets, in every digital channel. Andrew has over a decade of experience in finance, high tech, & online media, & earned his Computer Science degree at NYU. As a dedicated Pythonista, JavaScript hacker, & open source advocate, he works at the intersection of large-scale distributed systems, real-time measurement, & content analysis technologies. Relevant to Elasticsearch users, he is the author of "Lucene: The Good Parts", & was a presenter at Elastic{ON} in 2015, on "Web Content Analytics at Scale". He has also presented at PyData, PyCon, & several other technology conferences.
Talk 2: Elastic APM
Abstract: Over the last few years, Elastic has focused on optimizing the Elastic Stack for the use cases adopted by our community. This talk will focus on Elastic APM, the first open source APM solution. We'll explore instrumenting an application, the data model for metrics & logs, distributed tracing, & real user monitoring: the tools that enable real-time monitoring & intuitive troubleshooting.
Bio: Peter Soderberg is a Solutions Architect at Elastic based in Brooklyn, NY. He's spent years gleaning insights from diverse data, using tools like Elasticsearch & the Hadoop ecosystem.
----
Looking forward to seeing everyone in a few weeks!
PS: For those who missed our July Meetup at Vimeo, you can find the recording here: vim.io/2XLxPeR
Kind regards, Danielle