Sponsored Event 
With Mark Gross (President, Data Conversion Laboratory), Isu Shrestha (Sr ML Engineer, Fusemachines).
Venue: Online
Feb 01 (Wed), 2023 @ 12:00 PM
FREE
 
 
DETAILS

Extracting & structuring content from text- or image-based tables has long been a challenge. Transforming tabular content into a structured model such as XML or HTML is nearly always a manual or semi-manual process. Tabular content is particularly important in regulatory, financial, & scientific documents, where complex alphanumeric content is often presented in tables. Tables are tough to structure because of inconsistent content, highly diverse layouts, complicated elements such as straddle headings, varied content alignment, empty cells, & other intricacies.

Data Conversion Laboratory & Fusemachines created an AI model that finds & extracts information from all tables in a document using a combination of Computer Vision (CV) & Natural Language Processing (NLP). We'll review how we developed & managed a hybrid approach of rules-based processes & machine learning to identify & extract tabular data, & how we augmented training data to develop an AI model that automates table-to-XML extraction. This presentation dives into why automating table structuring is important, why we took the approaches we did, & how one can measure the efficacy of table identification & extraction.
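
As a rough illustration of what measuring extraction efficacy can look like, the sketch below scores extracted cells against a hand-labeled reference using cell-level precision, recall, & F1. This is a generic metric & not necessarily the one the speakers use; the Cell fields & the cell_scores helper are hypothetical names chosen for this example, & the sample table is made up.

```python
from typing import NamedTuple


class Cell(NamedTuple):
    """One table cell: grid position, span, and text (hypothetical schema)."""
    row: int
    col: int
    rowspan: int
    colspan: int
    text: str


def cell_scores(predicted, expected):
    """Return (precision, recall, F1) over exactly matching cells.

    A predicted cell counts as correct only if its position, span,
    and text all match a reference cell.
    """
    pred, gold = set(predicted), set(expected)
    matched = len(pred & gold)
    precision = matched / len(pred) if pred else 0.0
    recall = matched / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if matched else 0.0
    return precision, recall, f1


# Reference table: a straddle heading over two columns, plus one empty cell.
gold = [
    Cell(0, 0, 1, 2, "Dosage"),
    Cell(1, 0, 1, 1, "mg"),
    Cell(1, 1, 1, 1, "ml"),
    Cell(2, 0, 1, 1, "5"),
    Cell(2, 1, 1, 1, ""),
]

# Extraction output that missed the straddle heading's column span.
pred = [
    Cell(0, 0, 1, 1, "Dosage"),
    Cell(1, 0, 1, 1, "mg"),
    Cell(1, 1, 1, 1, "ml"),
    Cell(2, 0, 1, 1, "5"),
    Cell(2, 1, 1, 1, ""),
]

precision, recall, f1 = cell_scores(pred, gold)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Exact matching is strict by design: a single wrong span on a straddle heading costs a full cell, which makes structural errors visible even when all the text was read correctly.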

SPEAKERS
Mark Gross, President, Data Conversion Laboratory
Isu Shrestha, Senior Machine Learning Engineer, Fusemachines
 
 
 
 
 
© 2025 GarysGuide