
    The data dilemma: Is it a document or a record?

    05 Feb 2020

    Today there are 2.7 zettabytes of digital data in the world, and an estimated 1.7 megabytes of data is being generated per person every second.

    That’s not including the weight of physical information in the form of paper-based documents. So how are businesses collecting and managing such a wide variety of data? And are they in danger of focusing on collection to the detriment of how that data is classified, stored and retrieved?


    Initially, businesses used document management to digitally scan, upload and store documents, leading to the emergence of the document management solution (DMS). Features to help track and manage digitalised information included version control, annotations and time stamps, track changes, and audit trail capabilities. The growth of the internet and rise of digital media then saw DMS become incorporated into enterprise content management systems, allowing websites, social media, text, voice, and video to be added into the mix.


    Fast forward to the modern day, and organisations now have either a legacy DMS or a content management system already in place. Yet neither is now really delivering the precision required for records management. Both adopt a generic catch-all approach with everything classed as a “document” regardless of its origins. This makes it very difficult to treat media as a “record” which has a very different set of criteria and distinct management needs.


    Defining records


    ISO 15489-1 defines records management as being “responsible for the efficient and systematic control of the creation, receipt, maintenance, use and disposition of records, including the processes for capturing and maintaining evidence of, and information about, business activities and transactions in the form of records”. In other words, records are not limited to static documents: because they also chart processes and outcomes, they can take a wide variety of forms.


    Records can “provide information about what happened, what was decided and how to do things”, according to guidance from the National Archives, and act as “evidence of past actions and decisions”, giving them legal gravitas. Consequently, unlike a document, a record must have an identifiable origin, and it must be possible to show authoritatively that it has not been altered since its status as a record was assigned.
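
    One common way to show that content has not changed since it was declared a record is to store a cryptographic fingerprint alongside its origin. The following is a minimal sketch, assuming SHA-256 as the fingerprint. The function names `declare_record` and `is_unaltered` and the sample content are hypothetical, not drawn from ISO 15489-1 or any particular product.

```python
import hashlib
from datetime import datetime, timezone


def declare_record(content: bytes, origin: str) -> dict:
    """Capture origin, declaration time and a SHA-256 fingerprint of the content."""
    return {
        "origin": origin,
        "declared_at": datetime.now(timezone.utc).isoformat(),
        "sha256": hashlib.sha256(content).hexdigest(),
    }


def is_unaltered(content: bytes, record: dict) -> bool:
    """Return True if the content still matches the fingerprint taken at declaration."""
    return hashlib.sha256(content).hexdigest() == record["sha256"]


# Hypothetical sample content, for illustration only.
minutes = b"Board approved the 2020 budget."
record = declare_record(minutes, origin="board-secretary")

print(is_unaltered(minutes, record))                             # original content
print(is_unaltered(b"Board rejected the 2020 budget.", record))  # altered content
```

    A fingerprint on its own only detects change. A real system would also need to protect the stored fingerprint and origin details from being edited.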


    There remains some debate over whether “records” and “information” should be interchangeable terms. This is because certain compliance requirements such as the Data Protection Act and Freedom of Information Act refer to both. The National Archives suggests records may be seen as a “subset of information with particular qualities… such as structure, context and authenticity that come from the managed environment in which the records are kept and give them value as reliable evidence” but equally agrees these records may now simply be classed as “information”.


    Context is key


    Monikers aside, digital records require different capture processes from common-or-garden documents. Content management will typically process data by volume, using content processing to perform video capture, translation and audio transcription, for instance.


    Conversely, with regard to records, it’s imperative that the context of the record is obtained through the capture of accurate and relevant metadata. This allows the record not only to be stored, managed and indexed correctly but also to be aligned with other relevant data, adding real value when it comes to searchability.
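
    To make the idea of contextual metadata concrete, here is a small sketch of what a captured record description might look like and how links between records support retrieval. The field names, identifiers and the `find_related` helper are illustrative assumptions, not a prescribed metadata schema.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class RecordMetadata:
    """Context captured with a record (hypothetical fields)."""
    title: str
    creator: str
    created: date
    business_activity: str
    classification: str
    related_ids: tuple = ()  # identifiers of other records this one relates to


def find_related(index: dict, record_id: str) -> list:
    """Return the titles of records linked from the given record."""
    return [index[r].title for r in index[record_id].related_ids if r in index]


# Hypothetical sample index keyed by record identifier.
index = {
    "REC-001": RecordMetadata(
        "Supplier contract", "Procurement", date(2020, 1, 14),
        "Purchasing", "Contracts", related_ids=("REC-002",),
    ),
    "REC-002": RecordMetadata(
        "Contract approval email", "Finance", date(2020, 1, 10),
        "Purchasing", "Correspondence",
    ),
}

print(find_related(index, "REC-001"))
```

    Because the link to the approval email is stored with the contract, a search that finds one can surface the other.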


    Achieving this level of detail requires a dedicated records management solution that uses machine learning to identify patterns and connections as well as to classify data. The solution can be trained on the categories, weightings and confidence levels of the existing business classification scheme so that it can identify, analyse and detect key concepts within documents. Once the system has learnt the nuances of the classification scheme, it can even classify data by itself in compliance with governance-based processes within the Enterprise Content Management system.
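
    The interplay of categories, weightings and confidence levels can be sketched in a few lines. This is a deliberately simplified stand-in: it uses hand-set keyword weights rather than a trained machine learning model, and the `SCHEME` categories, terms, weights and the 0.7 threshold are all invented for illustration.

```python
# Hypothetical classification scheme: category -> {term: weight}.
SCHEME = {
    "Contracts": {"contract": 3.0, "supplier": 1.0, "signed": 2.0},
    "HR": {"employee": 3.0, "leave": 2.0, "signed": 1.0},
}


def classify(text: str, scheme: dict, threshold: float = 0.7):
    """Score text against each category and return (category, confidence).

    The category is None when confidence falls below the threshold,
    which signals that the item should go to a person for review.
    """
    # Naive whitespace tokenisation; punctuation is not handled in this sketch.
    words = text.lower().split()
    scores = {
        category: sum(weight for term, weight in terms.items() if term in words)
        for category, terms in scheme.items()
    }
    total = sum(scores.values())
    if total == 0:
        return None, 0.0
    best = max(scores, key=scores.get)
    confidence = scores[best] / total
    return (best if confidence >= threshold else None), confidence


print(classify("Signed supplier contract for office cleaning", SCHEME))
print(classify("Signed form", SCHEME))
```

    The first text matches several strongly weighted contract terms and is classified automatically. The second matches only an ambiguous term, so its confidence falls below the threshold and no category is assigned.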


    Automated but adjustable


    Because the system is taught how to recognise the attributes of data, it can differentiate between different kinds of content, ensuring documents are treated as documents and records as records. But while the system automates capture and classification, it can also be tweaked manually. Records managers can adjust the rules, weights and confidence levels to improve both classification accuracy and lifecycle management, and can monitor how these are applied.


    Machine learning and automation are, therefore, once again changing the face of document and records management. Content management now gives the business the ability to classify and contextualise information, improving retrieval by making the relationships between data evident, while ensuring records are preserved in an authoritative, secure and finalised state. It’s this ability to distinguish between documents and records, and to treat each accordingly, that will solve the data dilemma. Then perhaps we’ll finally see records management come out from under the shadow of document management.



    If you need help identifying documents and records in your organisation, contact us today to speak to one of our experts.



