
Artificial Intelligence Document Automation: “How To” in 3 Simple Steps (Webinar Takeaways)

AI-enabled smart document management systems have become a necessity for all businesses. From automated management and classification of bulk documentation to data extraction and security, AI document automation is revolutionizing the modern workplace.

Fusemachines conducted the first of three webinars on Tuesday, February 9th, with Carol (Cee) Bunevich, VP of Partnerships at Fusemachines; partner Jim de Vries, Founder of Enhance International Group (EIG); and Bülent Uyaniker, Fusemachines PhD consultant and Founder of DataSpeckle.

EIG challenges prevailing management practices by collaborating with companies on turnaround management, corporate restructuring, productivity, and performance improvement. It uses a variety of tools and methodologies, including Lean Six Sigma, to assess and analyze corporate processes and practices. EIG prides itself on a strong customer interface and on using feedback to work in partnership and drive change. The team comprises seasoned executives with global management experience.

Tuesday’s webinar focused on AI-augmented OCR (Optical Character Recognition) and its applications. The hosts highlighted how businesses can put their vast amounts of documentation to work and discussed 3 steps for achieving document automation with OCR.

What is OCR?

OCR is a technology that converts text in an image or photo (e.g., text on signs and billboards in a landscape photo, or subtitle text from a TV broadcast) or in a typed, printed, or handwritten PDF document into machine-encoded text that can be analyzed.

How an OCR engine works:

  • Image Acquisition
  • Pre-processing
  • Document and Layout Analysis
  • Character Recognition
  • Verification of Recognition Results
  • Output
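
The stages listed above can be sketched in a few lines of Python. This is a toy illustration only, not how a production OCR engine or the tools discussed in the webinar work: the three-letter template alphabet, the ASCII-art "scan", the grey levels, and every function name are assumptions made up for the example, and recognition is done by naive template matching.

```python
# Hypothetical 3x5 glyph templates ('#' = ink). A real engine uses learned models.
GLYPHS = {
    "I": ["###", ".#.", ".#.", ".#.", "###"],
    "L": ["#..", "#..", "#..", "#..", "###"],
    "T": ["###", ".#.", ".#.", ".#.", ".#."],
}

def acquire_image():
    """1. Image acquisition: a stand-in 'scan' as grey levels (40 = ink, 220 = paper)."""
    art = [
        "#...###.###",
        "#....#...##",  # the last pixel on this row is deliberate scanner noise
        "#....#...#.",
        "#....#...#.",
        "###.###..#.",
    ]
    return [[40 if ch == "#" else 220 for ch in row] for row in art]

def preprocess(gray, threshold=128):
    """2. Pre-processing: binarize so that ink = 1 and background = 0."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]

def segment_columns(binary):
    """3. Layout analysis: split the line into glyphs at blank columns."""
    width = len(binary[0])
    spans, start = [], None
    for x in range(width + 1):
        has_ink = x < width and any(row[x] for row in binary)
        if has_ink and start is None:
            start = x
        elif not has_ink and start is not None:
            spans.append((start, x))
            start = None
    return [[row[a:b] for row in binary] for a, b in spans]

def recognize(glyph):
    """4. Character recognition: pick the template with the most matching pixels."""
    best_char, best_score = "?", -1.0
    for char, template in GLYPHS.items():
        if len(template) != len(glyph) or len(template[0]) != len(glyph[0]):
            continue  # skip templates of a different size
        matches = sum(
            (t == "#") == (g == 1)
            for trow, grow in zip(template, glyph)
            for t, g in zip(trow, grow)
        )
        score = matches / (len(glyph) * len(glyph[0]))
        if score > best_score:
            best_char, best_score = char, score
    return best_char, best_score

def verify(results, min_confidence=0.8):
    """5. Verification: flag low-confidence characters with '?'."""
    return [(c if s >= min_confidence else "?", s) for c, s in results]

def run_pipeline():
    """6. Output: the recognized text plus a per-character confidence."""
    binary = preprocess(acquire_image())
    results = verify([recognize(g) for g in segment_columns(binary)])
    return "".join(c for c, _ in results), [round(s, 2) for _, s in results]

print(run_pipeline())  # ('LIT', [1.0, 1.0, 0.93])
```

The noisy pixel lowers the confidence of the last character without changing the result, which is the kind of case the verification stage is there to catch.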

Here are our takeaways.

OCR History

Between 1890 and 1931, the earliest ideas of OCR were conceived, leading to the development of Fournier d’Albe’s Optophone and Tauschek’s Reading Machine, devices aimed at helping the blind to read. Later, between the 1950s and 1970s, the first portable OCR device, the Optacon, was developed, along with similar devices used to digitize Reader’s Digest coupons and postal addresses. In the early 2000s, OCR technology became available as a service (WebOCR), in cloud computing environments, and in mobile applications such as language translation, and it has proliferated ever since. The latest Google OCR tools can now scan Google Drive files in over 200 languages for free!

Technique: From OCR Image to Digital Data

The OCR scanner preprocesses the images and extracts image data via pattern and feature recognition. The image data is then transformed into machine-encoded text (1s and 0s) so that the engine can perform analysis and decision-making to produce digital data. The OCR engine processes the image of each page by recognizing the text character by character, word by word, and line by line.
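
As a rough illustration of what "machine-encoded text" and the reading order mean in practice, the sketch below takes text that an OCR engine has already recognized, walks it line by line, word by word, and character by character, and shows the 1s and 0s behind a short string. The function names are hypothetical, and UTF-8 is an assumption; the webinar did not name a specific encoding.

```python
def to_bits(text):
    """Show the UTF-8 bytes of recognized text as groups of 1s and 0s."""
    return " ".join(f"{byte:08b}" for byte in text.encode("utf-8"))

def walk_page(page_text):
    """Yield (line number, word number, character) in reading order."""
    for line_no, line in enumerate(page_text.splitlines(), start=1):
        for word_no, word in enumerate(line.split(), start=1):
            for char in word:
                yield line_no, word_no, char

print(to_bits("OCR"))               # 01001111 01000011 01010010
print(list(walk_page("Hi A\nB")))   # [(1, 1, 'H'), (1, 1, 'i'), (1, 2, 'A'), (2, 1, 'B')]
```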

OCR and Hyper Automation/AI Augmented OCR

Digitizing paperwork only does half the job of documentation management and optimization. Leveraging hyper automation tools like machine learning, deep learning, and computer vision can extend the use cases of OCR beyond just text translation. This combination can automate 50 to 70 percent of tasks, which translates into annual run-rate cost efficiencies of 20 to 35 percent and a reduction in straight-through process time of 50 to 60 percent, with ROI most often in triple-digit percentages.
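
To make those percentage ranges concrete, here is a small worked calculation. The $1,000,000 annual run-rate cost and the 10-day process time are invented inputs chosen purely for illustration; only the percentage ranges come from the webinar.

```python
def savings_range(annual_cost, low_pct=20, high_pct=35):
    """Annual savings implied by a 20 to 35 percent run-rate cost efficiency."""
    return annual_cost * low_pct / 100, annual_cost * high_pct / 100

def reduced_time_range(current_days, low_cut_pct=50, high_cut_pct=60):
    """Process time remaining after a 50 to 60 percent reduction (best, worst)."""
    best = current_days * (100 - high_cut_pct) / 100
    worst = current_days * (100 - low_cut_pct) / 100
    return best, worst

# Hypothetical inputs: $1,000,000 per year and a 10-day straight-through process.
print(savings_range(1_000_000))   # (200000.0, 350000.0)
print(reduced_time_range(10))     # (4.0, 5.0)
```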

OCR as a Technical Enabler in the Accounts Payable Continuum / AI OCR applications

ERP systems tie together different processes and enable the flow of information. The information can be digitized automatically through OCR, thus enabling truly “paperless” transactions and processes. Machine learning and deep learning techniques can then be used to automate other processes based on business rules, and to improve visibility and enable further analytics.
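
A minimal sketch of that idea for accounts payable: once OCR has turned an invoice into text, a few fields can be pulled out and routed by business rules. The invoice layout, field labels, purchase-order table, approval limit, and function names below are all hypothetical; a real ERP integration would look different, and simple pattern matching stands in here for the machine learning extraction mentioned above.

```python
import re
from decimal import Decimal

# Hypothetical OCR output for one invoice.
OCR_TEXT = """ACME Supplies
Invoice No: INV-1042
PO Number: PO-7731
Total Due: $1,250.00
"""

def extract_fields(text):
    """Pull invoice number, PO number, and total out of the OCR text."""
    patterns = {
        "invoice_no": r"Invoice No:\s*(\S+)",
        "po_number": r"PO Number:\s*(\S+)",
        "total": r"Total Due:\s*\$([\d,]+\.\d{2})",
    }
    fields = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, text)
        fields[name] = match.group(1) if match else None
    if fields["total"] is not None:
        fields["total"] = Decimal(fields["total"].replace(",", ""))
    return fields

def route_invoice(fields, open_pos, auto_approve_limit=Decimal("5000")):
    """Apply simple business rules; anything unexpected goes to a person."""
    if None in fields.values():
        return "manual_review"      # OCR missed a required field
    if open_pos.get(fields["po_number"]) != fields["total"]:
        return "manual_review"      # unknown PO or amount mismatch
    if fields["total"] > auto_approve_limit:
        return "needs_approval"
    return "auto_approve"

fields = extract_fields(OCR_TEXT)
print(route_invoice(fields, {"PO-7731": Decimal("1250.00")}))  # auto_approve
```

Routing every incomplete or mismatched invoice to manual review keeps OCR errors from flowing silently into downstream systems.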

AIDR Scoping Framework

The AIDR framework (A = Algorithm Feasibility, I = Impact, D = Data, and R = Recurrence) is a 4-step framework that helps businesses identify and prioritize problems well suited to an AI solution. Using this framework, an organization can determine whether a project is AI-ready, what kind of ROI to expect, and what it would take to build an AI system to solve its problem.

  • The first thing to consider is whether this is a problem that lends itself to an AI solution (i.e., determining the viability of the algorithm).
  • Next, one needs to understand the impact of the solution by determining whether or not it results in a significant ROI.
  • The third step is to discover the data challenges. Without a large amount of data, building an AI model is almost impossible.
  • Lastly, find out how often the problem recurs and consider what would happen should it be left unsolved. Putting in the time and effort to create an AI solution makes the most sense for recurring problems.
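
One way to apply the four questions above is to score candidate projects against them and compare. The sketch below is an assumption layered on top of the framework: the 1-to-5 scale, the "every dimension at least 3" readiness rule, and the example projects are all invented for illustration and are not part of AIDR as presented in the webinar.

```python
from dataclasses import dataclass

@dataclass
class AidrAssessment:
    """Hypothetical 1-to-5 scores for the four AIDR dimensions."""
    algorithm_feasibility: int
    impact: int
    data: int
    recurrence: int

    def scores(self):
        return [self.algorithm_feasibility, self.impact, self.data, self.recurrence]

    def total(self):
        return sum(self.scores())

    def is_ai_ready(self, min_each=3):
        # Assumed rule: one weak dimension (e.g., too little data) blocks the project.
        return all(score >= min_each for score in self.scores())

def prioritize(candidates):
    """Order project names: AI-ready first, then by total score (highest first)."""
    return sorted(
        candidates,
        key=lambda name: (candidates[name].is_ai_ready(), candidates[name].total()),
        reverse=True,
    )

projects = {
    "one-off contract audit": AidrAssessment(4, 3, 2, 1),
    "invoice data capture": AidrAssessment(4, 5, 4, 5),
}
print(prioritize(projects))  # ['invoice data capture', 'one-off contract audit']
```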

For more information about EIG, please click here. To view the webinar, please click here.

Look out for the second webinar of the series, on applying AI pattern recognition in different environments, on March 23rd at 1:00 PM EST. We will announce and open the free registration soon on our LinkedIn page.