Document Management System for Digital Archives

Store, organize and index scanned documents in PDF, JPEG and TIFF formats. Instantly find relevant information using full-text, tag and metadata-based search.

Open source

All source code is released under the permissive Apache 2.0 license and is available to everyone on GitHub.

User Friendly

It features a beautiful, modern, easy-to-use and intuitive web-based user interface.

OCR

Performs OCR on your documents, adding searchable and selectable text, even to documents that were scanned as images only.

Versioning

Documents in Papermerge DMS are versioned. The originally uploaded version is always retained. Any operation you apply to a document creates a new document version: for example, the OCRed version of your document is stored as a separate version. This feature comes in handy if you want to store multiple versions of the same document, say a contract that was updated and is therefore available in multiple versions.
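
The versioning rule described above can be illustrated with a minimal Python sketch. The `Document` and `DocumentVersion` classes and the `upload` and `apply` methods are hypothetical names chosen for illustration; they are assumptions, not Papermerge's actual data model or API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DocumentVersion:
    # Hypothetical record: frozen, so a stored version is never modified.
    number: int        # 1 is the originally uploaded file
    description: str   # the operation that produced this version
    content: bytes


@dataclass
class Document:
    title: str
    versions: list = field(default_factory=list)

    @classmethod
    def upload(cls, title, content):
        doc = cls(title)
        doc.versions.append(DocumentVersion(1, "original upload", content))
        return doc

    def apply(self, description, transform):
        """Run an operation on the latest version and store the result as a new one."""
        latest = self.versions[-1]
        new = DocumentVersion(latest.number + 1, description, transform(latest.content))
        self.versions.append(new)
        return new


doc = Document.upload("contract.pdf", b"scanned pages")
# A stand-in for OCR: the real operation would add a text layer to the file.
doc.apply("OCR", lambda data: data + b" + text layer")
print([v.number for v in doc.versions])  # prints [1, 2]
```

Because `apply` only ever appends, the original upload stays available as version 1 no matter how many operations follow.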

Custom Fields

Custom fields are user-defined attributes attached to your document or, to be exact, to a category of documents. Examples of custom fields for receipts: "price", "date of issue", "issuer". A common use case for custom fields is to give a document a "custom ID", which may be a reference to the same document in an external system, or an ID of the document's physical location.

Custom fields are also known as document metadata.
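
As a rough illustration of fields belonging to a category rather than to an individual document, consider the following Python sketch. The `Category` and `Document` classes, the `set_field` method and the sample values are all hypothetical; this is an assumption about how such a model could look, not Papermerge's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Category:
    name: str
    field_names: tuple  # custom fields shared by every document in this category


@dataclass
class Document:
    title: str
    category: Category
    values: dict = field(default_factory=dict)

    def set_field(self, name, value):
        # Only fields defined on the document's category are accepted.
        if name not in self.category.field_names:
            raise KeyError(f"{name!r} is not a custom field of {self.category.name!r}")
        self.values[name] = value


receipts = Category("Receipts", ("price", "date of issue", "issuer", "custom ID"))
doc = Document("groceries.pdf", receipts)
doc.set_field("price", "12.50")
doc.set_field("issuer", "Corner Shop")
# A "custom ID" pointing at the physical location of the paper original.
doc.set_field("custom ID", "BOX-7/FOLDER-3")
```

Defining the fields once per category means every receipt carries the same set of attributes, which is what makes filtering and sorting by those attributes possible.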

You can visualize documents based on custom fields.

Page Management

Scanning documents in bulk often results in documents with out-of-order pages: some pages may be rotated or may belong to a totally different document. Even if you notice these problems immediately, it is time-consuming to redo the scanning process. Wouldn't it be nice to fix out-of-order pages without scanning all the documents again?

Page management is a set of features that helps you fix scanning errors. You can reorder, rotate, and extract pages within one or more documents.
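
The three operations can be sketched on a plain list of pages. The `Page` class and the `reorder`, `rotate` and `extract` functions below are hypothetical helpers written for illustration, assuming a page is just a label plus a rotation angle; they do not reflect Papermerge's actual API.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Page:
    label: str
    rotation: int = 0  # degrees clockwise


def reorder(pages, new_order):
    """Return pages arranged by the given list of current 0-based indexes."""
    if sorted(new_order) != list(range(len(pages))):
        raise ValueError("new_order must list every page index exactly once")
    return [pages[i] for i in new_order]


def rotate(pages, index, degrees):
    """Return a copy of the page list with one page rotated."""
    result = list(pages)
    result[index] = replace(pages[index], rotation=(pages[index].rotation + degrees) % 360)
    return result


def extract(pages, indexes):
    """Split pages into (remaining, extracted), e.g. to move stray pages elsewhere."""
    chosen = set(indexes)
    remaining = [p for i, p in enumerate(pages) if i not in chosen]
    extracted = [p for i, p in enumerate(pages) if i in chosen]
    return remaining, extracted


# A bulk scan with swapped pages, one upside-down page and one stray page.
scan = [Page("invoice p2"), Page("invoice p1", rotation=180), Page("letter p1")]
scan = reorder(scan, [1, 0, 2])
scan = rotate(scan, 0, 180)
invoice, letter = extract(scan, [2])
print([p.label for p in invoice], [p.label for p in letter])
# prints ['invoice p1', 'invoice p2'] ['letter p1']
```

Each function returns a new list instead of changing its input, which fits the versioning idea that an operation produces a new document version rather than overwriting the old one.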

OCR

Papermerge DMS performs optical character recognition (OCR) on your documents, adding searchable and selectable text, even to documents that were scanned as images only. It uses the open-source Tesseract engine to recognize more than 100 languages.