Document Transformation Infrastructure


Introduction

This project is being led by Lars Ballieu Christensen and Sensus in Denmark and is based on their award-winning RoboBraille and SensusAccess work. RoboBraille is a web- and email-based service capable of converting documents into a variety of alternate formats including digital Braille, audio books in MP3 and DAISY format, and e-books in EPUB, EPUB3 and Mobi Pocket format. The service can also be used to convert otherwise inaccessible documents (e.g., image-only PDF files, JPG pictures of text) and tricky document types (e.g., Microsoft PowerPoint presentations) into formats that are easier for print-impaired users to use. RoboBraille is intended for individual, non-commercial use and does not require registration. SensusAccess provides services similar to those of RoboBraille, is intended for non-commercial, institutional use among academic institutions, and requires a service agreement with Sensus.

This project will create an open-source document transformation engine that, while it will not have all of the capability and power of the RoboBraille/SensusAccess services, will allow people or organizations to set up ad hoc or commercial services to transform documents into accessible form. (This will eventually include documents with embedded media.) In addition, it will be designed in a modular fashion so that others can contribute to the open-source effort, creating new or improved modules to increase the capabilities and power of the open-source version over time. Finally, the project will contribute an open-source Braille translation module to the GPII DeveloperSpace.

All development will be completed within the framework of Microsoft .NET.

Description of the capabilities of the eventual infrastructure

From an architectural point of view, the document transformation engine will consist of the following modules:

1. A job-recipient module where the requester (human or system) will provide the source document as well as information about the requested transformation.

2. A job-processing module that will attempt to convert the source document into the requested target format in accordance with the job request.

3. A job-delivery module that will deliver the result of the requested transformation or a suitable error message to the requester.

The modules comprising the document transformation engine can be deployed in a distributed fashion spanning multiple geographies, or they can be configured to run on a single site. As the document transformation engine is self-contained, it can even be set up to run behind a firewall in a protected environment.
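
The three-module flow described above can be sketched as follows. This is a minimal illustration only, written in Python for brevity even though the project itself targets .NET. The names `Job`, `receive`, `process` and `deliver` are hypothetical and do not come from the RoboBraille code base, and an in-process queue stands in for whatever transport a distributed deployment would use.

```python
from dataclasses import dataclass
from queue import Queue
from typing import Callable, Optional


@dataclass
class Job:
    source: bytes                   # the submitted source document
    target_format: str              # hypothetical identifier, e.g. "braille"
    result: Optional[bytes] = None  # filled in by the processing stage
    error: Optional[str] = None     # set when the conversion fails


def receive(queue: Queue, source: bytes, target_format: str) -> Job:
    """Job-recipient stage: accept a request and put it on the queue."""
    job = Job(source=source, target_format=target_format)
    queue.put(job)
    return job


def process(queue: Queue, convert: Callable[[bytes, str], bytes]) -> Job:
    """Job-processing stage: take one job and attempt the conversion."""
    job = queue.get()
    try:
        job.result = convert(job.source, job.target_format)
    except Exception as exc:  # any failure becomes an error message
        job.error = str(exc)
    return job


def deliver(job: Job) -> str:
    """Job-delivery stage: return the result or a suitable error message."""
    if job.error is not None:
        return f"ERROR: {job.error}"
    return job.result.decode("utf-8")
```

Because each stage only touches the queue and the job record, the same three functions could in principle run in one process or be split across machines, which mirrors the single-site and distributed configurations mentioned above.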

Job-recipient module

The job-recipient module is a web service that will allow requesters to submit job requests to the document transformation infrastructure. A job request consists of (1) a source document; (2) a definition of the requested target format (e.g., digital Braille, MP3 audio file, DAISY full text/full audio project, e-book, accessibility conversion); (3) a set of job options qualifying the target format (e.g., Braille code, contraction level and Braille output format for Braille documents; language, gender and audio speed for MP3 files; e-book format for e-books; and document format for accessibility conversions); and (4) delivery instructions and details (e.g., email reply, FTP upload, call-back). The job-recipient web service stores job requests in a job queue in an SQL database. Multiple queues are used to manage load balancing as well as scarce resources (e.g., software licenses).
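
To make the four-part job request and the SQL-backed queue more concrete, here is a small sketch using Python and an in-memory SQLite database. The table layout, the column names, the option keys and the functions `open_queue_db`, `submit_job` and `next_job` are all assumptions made for illustration; the actual RoboBraille schema is not described in this document.

```python
import json
import sqlite3


def open_queue_db() -> sqlite3.Connection:
    """Create an in-memory database with one hypothetical job table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE job_queue (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               queue_name TEXT NOT NULL,        -- which queue the job belongs to
               source_document BLOB NOT NULL,   -- (1) the source document
               target_format TEXT NOT NULL,     -- (2) requested target format
               options TEXT NOT NULL,           -- (3) job options, as JSON
               delivery TEXT NOT NULL,          -- (4) delivery instructions, as JSON
               status TEXT NOT NULL DEFAULT 'pending'
           )"""
    )
    return conn


def submit_job(conn, queue_name, source_document, target_format, options, delivery):
    """Store a job request and return its row id."""
    cur = conn.execute(
        "INSERT INTO job_queue "
        "(queue_name, source_document, target_format, options, delivery) "
        "VALUES (?, ?, ?, ?, ?)",
        (queue_name, source_document, target_format,
         json.dumps(options), json.dumps(delivery)),
    )
    conn.commit()
    return cur.lastrowid


def next_job(conn, queue_name):
    """Fetch the oldest pending job on a queue and mark it as processing."""
    row = conn.execute(
        "SELECT id, target_format, options, delivery FROM job_queue "
        "WHERE queue_name = ? AND status = 'pending' ORDER BY id LIMIT 1",
        (queue_name,),
    ).fetchone()
    if row is None:
        return None
    conn.execute("UPDATE job_queue SET status = 'processing' WHERE id = ?", (row[0],))
    conn.commit()
    return {"id": row[0], "target_format": row[1],
            "options": json.loads(row[2]), "delivery": json.loads(row[3])}
```

Keeping the queue name as a column is one simple way to model multiple queues: an agent that holds a scarce software license would poll only the queue that needs it.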

Job-processing module

The job-processing module constitutes a processing agent (aka a RoboBraille agent). A processing agent reads from one or more job queues, determines the requested target format, processes the requests in accordance with the job options and hands over control to the job-delivery module to return the result to the requester. In processing the job, the processing agent exploits one or more conversion services. The conversion services include (1) OCR Services; (2) Office Conversion Services; (3) Audio Services; (4) DAISY Services; (5) Braille Services; and (6) E-book Services. In addition to these existing services, this project will define the interface to three additional services: (7) OSR Services; (8) Translation Services; and (9) Sign-language Services. Each of the services is described in the sections below.
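
One way to picture how an agent might pick "one or more conversion services" for a job is a small planning function like the one below. It is a sketch under stated assumptions: the file extensions, the target-format identifiers and the function `plan_services` are hypothetical, and the rule that at most one pre-conversion step is needed is a simplification of the descriptions in the following sections.

```python
# Source types that would need a pre-conversion step (assumed extensions).
OCR_SOURCES = {"pdf", "tiff", "gif", "jpg", "bmp"}
OFFICE_SOURCES = {"doc", "docx", "ppt", "pptx", "rtf"}

# Hypothetical target-format identifiers mapped to the service that produces them.
TARGET_SERVICES = {
    "mp3": "Audio Services",
    "daisy": "DAISY Services",
    "braille": "Braille Services",
    "epub": "E-book Services",
    "mobi": "E-book Services",
}


def plan_services(source_extension: str, target_format: str) -> list:
    """Return the ordered list of services a job would pass through."""
    target_service = TARGET_SERVICES.get(target_format.lower())
    if target_service is None:
        raise ValueError(f"unsupported target format: {target_format}")

    extension = source_extension.lower().lstrip(".")
    chain = []
    if extension in OCR_SOURCES:
        chain.append("OCR Services")                # image-type input
    elif extension in OFFICE_SOURCES:
        chain.append("Office Conversion Services")  # Word, PowerPoint, RTF input
    chain.append(target_service)
    return chain
```

For example, `plan_services(".jpg", "mp3")` yields the OCR step followed by the audio step, while a plain text file sent for Braille conversion goes straight to the Braille step.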

RoboBraille OCR Services

OCR Services are used to convert various types of PDF files as well as a variety of image-type documents (e.g., files in TIFF, GIF, JPG, BMP format) into file formats that are either requested directly by the requester or required for the further processing of the source document. OCR Services rely on a third-party commercial or open-source OCR engine.


Keywords: OCR,graphics,braille
FurtherInfo: http://robobraille.org/, Contact Lars Ballieu Christensen



RoboBraille Office Conversion Services

Office Conversion Services are used to convert Microsoft Word, Microsoft PowerPoint and RTF files into file formats that are either requested directly by the requester or required for the further processing of the source document. Office Conversion Services rely on Microsoft Office.


Keywords: office,word,xls,ppt,rtf
FurtherInfo: http://robobraille.org/, Contact Lars Ballieu Christensen



RoboBraille Audio Services

Audio Services are used to create MP3 files from submitted documents. Prior to conversion, all document types are converted into text files in Unicode format, potentially using either the OCR Services or the Office Conversion Services. The text is first converted into a WAV file, which is subsequently compressed into an MP3 file. Audio Services are implemented using Microsoft SAPI and rely on third-party SAPI 5-compliant commercial or open-source text-to-speech engines.
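
The text-to-WAV-to-MP3 sequence can be sketched as a two-step pipeline. The example below is illustrative only: it uses Python's standard `wave` module instead of Microsoft SAPI, `fake_tts` is a hypothetical stand-in that emits silence rather than speech, and the MP3 encoder is passed in as a callable because the Python standard library does not include one.

```python
import io
import wave

SAMPLE_RATE = 16000  # samples per second; an assumed value


def fake_tts(text: str) -> bytes:
    """Stand-in for a text-to-speech engine: 10 ms of silence per character."""
    samples = len(text) * SAMPLE_RATE // 100
    return b"\x00\x00" * samples  # 16-bit mono PCM


def text_to_wav(text: str, tts=fake_tts) -> bytes:
    """Step 1: synthesize the text and wrap the PCM data in a WAV container."""
    pcm = tts(text)
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)           # mono
        wav.setsampwidth(2)           # 2 bytes per sample (16-bit)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(pcm)
    return buffer.getvalue()


def text_to_mp3(text: str, encode_mp3, tts=fake_tts) -> bytes:
    """Step 2: compress the intermediate WAV data with the supplied encoder."""
    return encode_mp3(text_to_wav(text, tts))
```

Passing the speech engine and the encoder as parameters reflects the description above, in which the service relies on interchangeable third-party text-to-speech engines.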


Keywords: mp3,wav,braille
FurtherInfo: http://robobraille.org/, Contact Lars Ballieu Christensen



RoboBraille DAISY Services

DAISY Services are used to create structured audio books in the DAISY format, including DAISY full text/full audio, DAISY text only and DAISY audio only. Prior to conversion, all document types are converted into Office Open XML files, potentially using either the OCR Services or the Office Conversion Services. The service is implemented using third-party open-source software components (Daisy Pipeline, SaveAsDaisy) and Microsoft Office. DAISY Services furthermore rely on third-party SAPI 5-compliant commercial or open-source text-to-speech engines.


Keywords: mp3,wav,braille
FurtherInfo: http://robobraille.org/, Contact Lars Ballieu Christensen



RoboBraille Braille Services

Braille Services are used to create and format digital Braille documents. Prior to conversion, all document types are converted into text files, potentially using either the OCR Services or the Office Conversion Services. Braille documents can be delivered in Unicode, in the native Braille character set of a Braille device or in the Portable Embosser Format (PEF). Braille Services rely on third-party commercial or open-source Braille transcription software.
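
As a toy illustration of the Unicode delivery option, the sketch below maps lowercase letters to characters in the Unicode Braille Patterns block (starting at U+2800), where dot n of a cell corresponds to bit n-1 of the code point offset. It covers only the 26 basic letters and the space, with no contractions, capital indicators, numbers or punctuation, so it is not a substitute for real Braille transcription software. The function name `to_unicode_braille` is hypothetical.

```python
# Dot numbers (1-6) for the letters a-j.
_A_TO_J = {
    "a": (1,), "b": (1, 2), "c": (1, 4), "d": (1, 4, 5), "e": (1, 5),
    "f": (1, 2, 4), "g": (1, 2, 4, 5), "h": (1, 2, 5), "i": (2, 4), "j": (2, 4, 5),
}


def _build_table() -> dict:
    """Derive the remaining letters from the a-j patterns."""
    table = dict(_A_TO_J)
    table[" "] = ()                                   # blank cell
    for base, letter in zip("abcdefghij", "klmnopqrst"):
        table[letter] = _A_TO_J[base] + (3,)          # k-t add dot 3
    for base, letter in zip("abcde", "uvxyz"):
        table[letter] = _A_TO_J[base] + (3, 6)        # u, v, x, y, z add dots 3 and 6
    table["w"] = (2, 4, 5, 6)                         # w does not follow the pattern
    return table


_TABLE = _build_table()


def to_unicode_braille(text: str) -> str:
    """Transliterate basic letters and spaces to Unicode Braille cells."""
    cells = []
    for char in text.lower():
        if char not in _TABLE:
            raise ValueError(f"character not covered by this toy table: {char!r}")
        offset = sum(1 << (dot - 1) for dot in _TABLE[char])
        cells.append(chr(0x2800 + offset))
    return "".join(cells)
```

A production module would instead apply the rules of a specific Braille code and contraction level, which is exactly what the job options described earlier are meant to select.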


Keywords: PEF,braille
FurtherInfo: http://robobraille.org/, Contact Lars Ballieu Christensen



RoboBraille E-book Services

E-book Services are used to create e-books in EPUB, EPUB3 or Mobi Pocket format. Prior to conversion, all document types are converted into Office Open XML files, potentially using either the OCR Services or the Office Conversion Services. The service is implemented using third-party open-source software components (Daisy Pipeline, SaveAsDaisy, Calibre) and Microsoft Office.


Keywords: epub,mobi
FurtherInfo: http://robobraille.org/, Contact Lars Ballieu Christensen


RoboBraille OSR Services

This is a service stub and prototype implementation of a module that will attempt to recognize the semantic structure of documents containing tabular data.

RoboBraille Translation Services

This is a service stub that will provide an interface to a future language-to-language translation service. The interface will allow requesters to specify source and target languages as well as translation options.

RoboBraille Sign-language Services

This is a service stub that will provide an interface to a future text-to-animated sign language translation service. The interface will allow requesters to specify source and target languages as well as translation options.
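
A service stub of the kind described for the translation and sign-language services might look like the following sketch. The class and field names are hypothetical and are not taken from the project; the point is only that the interface fixes the request shape (source language, target language, options) while the stub itself does no translation.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class TranslationRequest:
    text: str
    source_language: str   # e.g. "en"
    target_language: str   # e.g. "da", or a sign-language identifier
    options: dict = field(default_factory=dict)  # free-form translation options


class TranslationService(ABC):
    """Interface that a future translation engine would implement."""

    @abstractmethod
    def translate(self, request: TranslationRequest) -> bytes:
        """Return the translated document, or animation data for sign language."""


class StubTranslationService(TranslationService):
    """Placeholder that accepts requests but has no engine behind it."""

    def translate(self, request: TranslationRequest) -> bytes:
        raise NotImplementedError(
            "no translation engine configured for "
            f"{request.source_language} -> {request.target_language}"
        )
```

Callers can already be written against `TranslationService`, and a real engine can later replace the stub without changing them.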

Development efforts and approximate timelines

The following deliveries have been scheduled:

End of Year 1

Delivery of a web-service-based job-recipient module to the existing RoboBraille service. This module will enable third-party systems (e.g., digital library systems, learning management systems, apps) to interface with the RoboBraille service and request document conversions. Sensus also expects to deliver a demonstration app for iOS for converting JPG pictures to MP3 audio files.

End of Year 2

Delivery of an open-source Braille translation module. Sensus expects to implement a Braille conversion module that supports Unified English Braille (UEB).

End of Year 3

Delivery of the first version of the document transformation engine, deployable in the crowdsourcingX2/Cloud4all infrastructure, along with a draft implementation guide (delivery D204.4).

End of Year 4

Delivery of a refined stand-alone version of the document transformation engine and an implementation guide for integrating the document transformation engine with third-party systems (delivery D204.4).

Delivery of modular interfaces for the OSR services prototype and the stubs for the translation and sign-language services (delivery D204.6).