Leading publisher digitized over 2.5 billion documents using Smart Digital Document Digitization Solution



About the Client


A renowned digital newspaper library


News & Media (Digital Archiving)




It digitizes library holdings, including books, periodicals and literature archives, so that users can search for and collect information easily.

Project Overview: Digitization of newspaper library resources using a simplified and automated document archiving technique

In the digital world, people want easy access to information on the devices they carry. When it comes to reading, most people use online reading apps or download PDFs to read later on their phones, tablets or laptops. Thousands of online reading resources and platforms exist, but there is hardly a single platform for the books, periodicals and literature of the past.

Digital archiving is a boon in this situation. Digitizing a library, including its books, periodicals and literature archives, gives users the ability to search for and collect information easily. It supports research and helps determine trends through historical and political analysis.

The client wanted to develop a platform that provides digital archiving and intelligent document processing using advanced OCR technology. However, the cost of converting analog books to digital was huge, and they were looking for a faster, more cost-effective way to digitize over a million books from their archives. The problem was tough but not impossible. The solution that we developed enabled them to create a digital archive in a format that efficiently manages and preserves history while saving space, time and money.

The platform includes digital scanning (imaging), which turns documents into a word-searchable format. Users can search for a document by keyword, read the files and store them securely.
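
As a rough illustration of what "word-searchable" means, the sketch below builds a tiny in-memory inverted index over text that an OCR step might produce. The `Page` type, the sample page ids and text, and all function names are hypothetical and chosen only for this example; the production platform relies on an OCR engine and Apache Solr, not on a toy index like this one.

```python
import re
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    """One scanned page after OCR (hypothetical structure)."""
    page_id: str
    text: str


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages: list[Page]) -> dict[str, set[str]]:
    """Map each word to the set of page ids whose OCR text contains it."""
    index: dict[str, set[str]] = defaultdict(set)
    for page in pages:
        for word in tokenize(page.text):
            index[word].add(page.page_id)
    return index


def search(index: dict[str, set[str]], query: str) -> list[str]:
    """Return the sorted ids of pages that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return []
    hits = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(hits)


# Made-up sample data standing in for OCR output.
pages = [
    Page("1923-05-01-p1", "City council approves new bridge"),
    Page("1923-05-02-p3", "Bridge construction delayed by flood"),
]
index = build_index(pages)
print(search(index, "bridge"))        # ['1923-05-01-p1', '1923-05-02-p3']
print(search(index, "bridge flood"))  # ['1923-05-02-p3']
```

Requiring every query word to match keeps results narrow; a real search engine would also rank results and tolerate OCR errors, which this sketch does not attempt.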

Solving paper-based archiving issues with Digital Archiving

Digital archiving solutions need to reliably store and retrieve high volumes of data. Manually searching paper sources for archived data can consume a lot of time, and finding a particular document or book in a vast collection can sometimes be impossible.

To solve this issue, we developed an online platform with digitized content that lets users read and search content with simple keywords. Readers can access historical news for research using filters and keywords.

Multiple Libraries & Historical Societies, Newspaper Publishers, Government Agencies and Educational Institutions were connected to serve content.

Paper-based archiving Issues presented

  • Difficult to access physical archive of publications

    It was difficult to visit multiple public libraries to access the physical archives of books, newspapers and magazines. There was a need for a single platform from which all these resources could be accessed.

  • Difficult to search a particular resource

    It was almost impossible to search for and find a particular resource for research in the vast library of archived content.

  • Costly and time consuming

    Important documents such as newspapers and articles can reach significant volumes and take up precious and expensive storage space, while manually searching for relevant data can consume an extraordinary amount of time.

  • Unavailability of online archives

    There was a need for an online platform where readers could view and read archived content from the past.

Business Benefits Achieved

  • The solution helped a daily newspaper publisher digitize a massive archive of over 2 billion records.
  • The platform offered end-to-end archiving services, from scanning and OCR through storage, with an Apache Solr search engine implementation.
  • The DreamzTech team designed and developed a tool that lets online readers and researchers seamlessly search and read the papers on mobile and tablet devices.
  • The tool was loaded with robust utility features such as highlight, snip, zoom and download.

2.5B +

Archived Content Digitized

500 +

Sources Connected (Libraries & Historical Societies, Newspaper Publishers etc.)

10M +

Historical Pages Served


Automatic Data Extraction

Solutions provided by DreamzTech

Digitization of entire library resources

Creation of an online platform for digitized content, where readers can access historical news for research by searching and reading the content.

Advanced search engine with Apache Solr

Apache Solr was implemented on the platform, enabling users to search the digital content using filters and keywords.
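
To make keyword-plus-filter search concrete, here is a minimal sketch of how a request to Solr's standard `/select` handler could be assembled. The request parameters (`q`, `fq`, `defType`, `qf`, `hl`, `hl.fl`, `rows`, `start`, `wt`) are standard Solr query parameters, but the core name `newspapers`, the field names `ocr_text`, `publication` and `year`, and the helper functions are assumptions made for illustration; the client's actual schema is not described in this case study.

```python
from urllib.parse import urlencode


def escape_phrase(value: str) -> str:
    """Escape backslashes and double quotes for use inside a quoted Solr phrase."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def build_solr_query(keywords: str, filters: dict[str, str],
                     rows: int = 10, start: int = 0) -> list[tuple[str, str]]:
    """Build request parameters for a keyword search narrowed by field filters."""
    params = [
        ("q", keywords),           # the user's keywords
        ("defType", "edismax"),    # parser suited to free-text user input
        ("qf", "ocr_text"),        # hypothetical field holding the OCR text
        ("rows", str(rows)),       # page size
        ("start", str(start)),     # paging offset
        ("hl", "true"),            # ask Solr to highlight matches
        ("hl.fl", "ocr_text"),     # field to highlight (hypothetical)
        ("wt", "json"),            # response format
    ]
    # Each filter becomes its own fq clause, which Solr applies without
    # affecting relevance scoring.
    for field, value in sorted(filters.items()):
        params.append(("fq", f'{field}:"{escape_phrase(value)}"'))
    return params


def build_solr_url(base_url: str, params: list[tuple[str, str]]) -> str:
    """Join the core's base URL, the /select handler and the encoded parameters."""
    return f"{base_url}/select?{urlencode(params)}"


params = build_solr_query("bridge flood",
                          {"publication": "Daily Herald", "year": "1923"})
print(build_solr_url("http://localhost:8983/solr/newspapers", params))
```

Keeping filters in `fq` and keywords in `q` is the usual division of labor in Solr: the filters restrict the result set, and only the keywords influence ranking.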

Multiple sources integrated together

Multiple Libraries & Historical Societies, Newspaper Publishers, Government Agencies and Educational Institutions were connected to serve content.

Subscription management module

The platform comes with an integrated subscription management module to manage subscriptions and memberships of paid members.

Key Features

How DreamzTech helped ensure an end-to-end solution

Requirement Gathering and Analysis

Onshore Business Analysts from DreamzTech interviewed stakeholders at the client's premises and gathered requirements for the project. They worked with offshore Business Analysts and Solution Architects to analyze the business objectives discussed with the client.

Project Planning and Execution

The dedicated project team analyzed the project objectives and the probable solutions to determine the technologies to be used, the database structure, the data flow and the microservices. An Agile project plan, a backlog document and sprints were then planned.

UAT and Deployment

Our team of Project Managers, Tech Leads, Designers, Developers and Quality Analysts worked with the client for over 6 months to deploy the first alpha release, then continued for 5 years following Agile methodology to release multiple versions and updates.


Overview of Services Provided

  • Consultation

    Business Analysts and Solution Architects from DreamzTech communicated with the client and created a backlog of tasks. All tasks were evaluated and prioritized after consultation with, and approval from, the client.

  • UX/UI design

    Based on the requirements, the design team prepared wireframes, presented prototypes for approval and confirmation, and revised them based on feedback from the client.

  • Project management

    A dedicated team consisting of a Project Manager (PM), Tech Leads, Developers and Quality Analysts (QAs) was formed to develop the application for Android and iOS. Tasks were divided between teams with specific delivery deadlines.

  • Integration

    Third-party services such as Apache Solr, ABBYY OCR and payment processors like Stripe and PayPal were integrated.

  • Quality control

    Based on approved test cases, quality analysis, A/B testing, usability testing and similar checks were conducted, and the issues identified were fixed.

  • UAT & release

    After QA was completed, an alpha release was made. Our Quality Analysts performed manual and automated tests to ensure delivery quality and compliance with industry standards before we delivered the release for user acceptance.
