Spark NLP is the most widely used NLP library in the enterprise, thanks to its production-grade, trainable, & scalable implementations of state-of-the-art deep learning & transfer learning NLP research. It is also open source under a permissive Apache 2.0 license, officially supports the Python, Java, & Scala languages, & is backed by a highly active community & JSL members.
The Spark NLP library implements core NLP algorithms including lemmatization, part-of-speech tagging, dependency parsing, named entity recognition, spell checking, multi-class & multi-label text classification, sentiment analysis, emotion detection, unsupervised keyword extraction, & state-of-the-art Transformers such as BERT, ELECTRA, ELMo, ALBERT, XLNet, & the Universal Sentence Encoder.
The latest release, Spark NLP 3.0, comes with over 1,100 pretrained models, pipelines, & Transformers in 190+ languages. It also delivers massive speedups on both CPU & GPU devices while extending support for the latest computing platforms such as new Databricks runtimes & EMR versions.
The talk will focus on how to scale Apache Spark / PySpark applications on YARN clusters, use GPUs in Databricks' new Apache Spark 3.x runtimes, & manage large-scale datasets efficiently in resource-demanding NLP applications. We will share benchmarks, tips & tricks, & lessons learned from scaling Spark NLP.
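As a starting point for the YARN scaling discussed above, a hypothetical `spark-submit` configuration sketch is shown below; the executor counts & memory sizes are illustrative placeholders to tune per cluster, not the benchmarked settings from the talk (the Kryo serializer settings follow Spark NLP's documented recommendations):

```shell
# Illustrative spark-submit settings for a Spark NLP job on YARN
# (num-executors, cores, & memory are placeholders; tune per workload)
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --num-executors 10 \
  --executor-cores 4 \
  --executor-memory 16G \
  --driver-memory 8G \
  --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
  --conf spark.kryoserializer.buffer.max=2000M \
  --packages com.johnsnowlabs.nlp:spark-nlp_2.12:3.0.0 \
  my_nlp_job.py
```

Kryo serialization with a large buffer matters here because Spark NLP ships sizeable annotator models & embeddings to the executors.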