Metadata

Search engines evaluate meta tags to help decide a web page's relevance. Until the late 1990s, meta tags were the key factor in determining a page's position in search results. The rise of search engine optimization (SEO) toward the end of the 1990s led many websites to stuff their metadata with keywords to trick search engines into treating them as more relevant than other sites.

Since then, search engines have reduced their reliance on meta tags, although they are still factored in when indexing pages. Many search engines also try to thwart web pages' ability to deceive their system by regularly changing their criteria for rankings, with Google being notorious for frequently changing its ranking algorithms.

Metadata can be created manually or by automated information processing. Manual creation tends to be more accurate, allowing the user to input any information they feel is relevant or that would help describe the file. Automated metadata creation can be much more elementary, usually only displaying information such as file size, file extension, when the file was created and who created the file.
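
As a rough illustration of automated metadata creation, the Python sketch below reads the basic attributes a system can derive without any human input. The function name `extract_basic_metadata` and the sample file `report.txt` are hypothetical, chosen only for this example. It reports the last-modified time rather than the creation time, because a true creation timestamp is not available on every platform.

```python
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def extract_basic_metadata(path):
    """Collect metadata that can be derived automatically from a file."""
    p = Path(path)
    info = p.stat()
    modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
    return {
        "name": p.name,
        "extension": p.suffix,
        "size_bytes": info.st_size,
        # Modification time is used here; creation time is platform-dependent.
        "modified_utc": modified.isoformat(),
        # Numeric owner ID; mapping it to a user name is left out of this sketch.
        "owner_uid": info.st_uid,
    }


# Demonstration on a throwaway file (hypothetical name and contents).
with tempfile.TemporaryDirectory() as tmp:
    sample_path = Path(tmp) / "report.txt"
    sample_path.write_text("hello")
    sample = extract_basic_metadata(sample_path)

print(sample)
```

Everything in the result comes from the file system, which is why automated metadata of this kind tends to be limited compared with a description written by a person.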

Metadata use cases

Metadata is created any time a document, file or other information asset is modified, including when it is deleted. Accurate metadata can help prolong the lifespan of existing data by helping users find new ways to apply it.

Metadata organizes a data object by using terms associated with that particular object. It also enables objects that are dissimilar to be identified and paired with like objects to help optimize the use of data assets. As noted, search engines and browsers determine which web content to display by interpreting the metadata tags associated with an HTML document.
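
To make the idea of interpreting HTML metadata tags concrete, here is a minimal sketch that collects the `<meta>` tags from a page using Python's standard `html.parser` module. The class name `MetaTagParser` and the sample page are assumptions made for illustration. Real search engines use far more elaborate and undisclosed pipelines, so this shows only the first step of reading the tags.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collect name/content pairs from <meta> tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        # Standard tags use "name"; Open Graph style tags use "property".
        key = attributes.get("name") or attributes.get("property")
        content = attributes.get("content")
        if key and content is not None:
            self.meta[key.lower()] = content


# Hypothetical page used only for this example.
SAMPLE_HTML = """
<html>
  <head>
    <title>Metadata basics</title>
    <meta name="description" content="An introduction to metadata.">
    <meta name="keywords" content="metadata, data management">
    <meta property="og:title" content="Metadata basics">
  </head>
  <body><p>Body text is ignored by this parser.</p></body>
</html>
"""

parser = MetaTagParser()
parser.feed(SAMPLE_HTML)
page_meta = parser.meta

for key, value in page_meta.items():
    print(f"{key}: {value}")
```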

The language of metadata is written to be understandable to both computer systems and humans, a level of standardization that contributes to better interoperability and integration between disparate applications and information systems.

Companies in digital publishing, engineering, financial services, healthcare and manufacturing use metadata to gather insights on ways to improve products or upgrade processes. For example, streaming content providers automate the management of intellectual property metadata so it can be stored across an array of applications, thus protecting copyright holders while at the same time making music and videos accessible to authenticated users.

The maturity of AI technologies is somewhat easing the traditional burden of managing metadata by automating previously manual processes to catalog and tag information assets.

History and origins of metadata

Jack E. Myers, founder of Metadata Information Partners (now The Metadata Co.), claims to have coined the term in 1969. Myers filed a trademark for the unhyphenated word "metadata" in 1986. Despite this, references to the term appear in academic papers that predate Myers' claim.

In an academic paper published in 1967, Massachusetts Institute of Technology professors David Griffel and Stuart McIntosh described metadata as "a record … of the data records" that result when bibliographic data about a topic is gathered from discrete sources. The researchers concluded that a "meta-linguistic approach," or "meta language," is needed to enable a computer system to properly interpret this data and its context to other relevant pieces of data. Unlike Myers, Griffel and McIntosh treated "meta" as a prefix to "data."

In 1964, an undergraduate computer science major named Philip R. Bagley started work on his dissertation, in which he argued that efforts to "make composite data elements" ultimately rest on the ability to "associate explicitly" to a second and related data element, which "we might term a 'metadata element.'" Although his thesis was rejected, Bagley's work, including his reference to metadata, was subsequently published as a report under a contract with the U.S. Air Force Office of Scientific Research in January 1969.

Types of metadata and examples

Metadata is variously categorized based on the function it serves in information management.

  • Administrative metadata allows administrators to impose rules and restrictions governing data access and user permissions. It also furnishes information on required maintenance and management of data resources. Often used in the context of ongoing research, administrative metadata includes such details as date created, file size and type, and archiving requirements.
  • Descriptive metadata identifies specific characteristics of a piece of data, such as bibliographic data, keywords, song titles, volume numbers, etc.
  • Legal metadata provides information on creative rights, such as copyrights, licensing and royalties.
  • Preservation metadata guides the placement of a data item within a hierarchical framework or sequence.
  • Process metadata outlines procedures used to collect and treat statistical data. Statistical metadata is another term for process metadata.
  • Provenance metadata, also known as data lineage, tracks the history of a piece of data as it moves throughout an organization. Original documents are paired with metadata to ensure that data is valid or to correct errors in data quality. Checking the provenance is a customary practice in data governance.
  • Reference metadata relates to information that describes the quality of statistical content.
  • Statistical metadata describes data that enables users to properly interpret and use statistics found in reports, surveys and compendia.
  • Structural metadata reveals how different elements of a compound data object are assembled. Structural metadata is often used in digital media content, such as describing how pages in an audiobook should be organized to form a chapter, and how chapters should be organized to form volumes, and so on. The term "technical metadata" is a synonym most closely associated with items in digital libraries.
  • Use metadata is data that is sorted and analyzed each time a user accesses it. Based on analysis of use metadata, businesses can pick out trends in customer behavior and more readily adapt their products and services to meet customers' needs.
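
One way to picture how several of these categories can sit side by side on a single asset is the Python sketch below. The `MetadataRecord` class, its field names and the audiobook values are all hypothetical. They are not drawn from any metadata standard and serve only to show the categories grouped together, with provenance kept as an ordered history.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataRecord:
    """Group metadata for one asset by the function it serves."""

    descriptive: dict = field(default_factory=dict)
    administrative: dict = field(default_factory=dict)
    legal: dict = field(default_factory=dict)
    structural: dict = field(default_factory=dict)
    # Provenance is an ordered list so the history of the asset is preserved.
    provenance: list = field(default_factory=list)

    def log_step(self, system, action):
        """Append one step to the lineage of the asset."""
        self.provenance.append({"system": system, "action": action})

    def lineage(self):
        """Return the systems the asset passed through, in order."""
        return " -> ".join(step["system"] for step in self.provenance)


# Hypothetical audiobook record used only for illustration.
record = MetadataRecord(
    descriptive={"title": "Example Audiobook", "keywords": ["sample", "audio"]},
    administrative={"file_type": "mp3", "size_bytes": 1024},
    legal={"license": "All rights reserved"},
    structural={"chapters": ["Chapter 1", "Chapter 2"]},
)
record.log_step("ingest", "uploaded by publisher")
record.log_step("catalog", "keywords added")

print(record.lineage())
```

Keeping each category in its own field makes it easy to apply different rules to each one, for example restricting who may edit the legal fields while leaving the descriptive fields open to catalogers.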
