NER (Named Entity Recognition in Python)

NER stands for Named Entity Recognition. It is a natural language processing (NLP) task that identifies named entities in unstructured text and classifies them into predefined categories such as person names, organization names, locations, dates, and monetary values. The goal of NER is to extract meaningful information and give the text a structured representation, which makes the content easier for programs to search and analyze.
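To make "identifying and classifying" concrete, here is a toy rule-based sketch that tags a few entity types with regular expressions and a small hand-written organization list. It only illustrates the input and output of the task: the patterns, the `KNOWN_ORGS` list, and the function name `toy_ner` are assumptions made for this example, and statistical libraries such as spaCy work very differently.

```python
import re

# Hypothetical, hand-written rules for illustration only.
KNOWN_ORGS = ["Federal Reserve", "BSE", "NSE"]
PATTERNS = [
    ("MONEY", re.compile(r"\$\d[\d,]*(?:\.\d+)?")),
    ("PERCENT", re.compile(r"\d+(?:\.\d+)?%")),
    ("ORG", re.compile("|".join(re.escape(name) for name in KNOWN_ORGS))),
]


def toy_ner(text):
    """Return (entity_text, label, start, end) tuples sorted by position."""
    found = []
    for label, pattern in PATTERNS:
        for match in pattern.finditer(text):
            found.append((match.group(), label, match.start(), match.end()))
    return sorted(found, key=lambda item: item[2])


sample = "Brent crude rose 0.3% to $84.30 ahead of the Federal Reserve decision."
for entity in toy_ner(sample):
    print(entity)
```

Each tuple pairs a span of the original text with a category, which is the structured representation the definition above refers to.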

Real-world applications of NER:

  1. Information retrieval: NER helps in extracting relevant information from large volumes of unstructured text, improving search results, and facilitating efficient information retrieval.
  2. Document summarization: By identifying key entities, NER can assist in generating concise and accurate summaries of documents.
  3. Question-answering systems: NER aids in extracting relevant entities from the question and the text to provide precise answers.
  4. Sentiment analysis: Recognizing entities like people, products, and locations can help in better understanding the sentiments expressed in the text.
  5. Named Entity Linking (NEL): NER is the first step in linking entities in the text to a knowledge base, which connects each mention to the real-world entity it refers to.
  6. Financial analysis: Identifying monetary values, companies, and other financial entities can be valuable in analyzing financial news and reports.
  7. Social media analysis: NER is employed to analyze social media content, identify trending topics, and understand public sentiment around specific entities.
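As a rough illustration of the entity-linking use case from the list above, the sketch below maps recognized surface forms to canonical entries in a tiny in-memory knowledge base. The `KNOWLEDGE_BASE` contents, its identifiers, and the `link_entities` helper are all hypothetical; real linkers resolve against much larger resources and use context to disambiguate.

```python
# Hypothetical knowledge base: canonical name -> made-up ID and known aliases.
KNOWLEDGE_BASE = {
    "Federal Reserve": {"id": "KB001", "aliases": {"fed", "federal reserve"}},
    "United States": {"id": "KB002", "aliases": {"us", "u.s.", "united states"}},
}


def link_entities(entities):
    """Attach (canonical_name, kb_id) to each (text, label) pair, or None if unknown."""
    linked = []
    for text, label in entities:
        match = None
        for canonical, entry in KNOWLEDGE_BASE.items():
            if text.lower() in entry["aliases"]:
                match = (canonical, entry["id"])
                break
        linked.append((text, label, match))
    return linked


# Hand-written (text, label) pairs standing in for NER output.
recognized = [("Fed", "ORG"), ("US", "GPE"), ("Sensex", "ORG")]
for item in link_entities(recognized):
    print(item)
```

Mentions that have no entry in the knowledge base come back with `None`, so they can be reviewed or ignored downstream.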

NER in Python, step by step:

To perform NER in Python, we can use various libraries such as spaCy, NLTK, and Stanford NER. Here, we'll use spaCy, a popular NLP library, for the demonstration.

Step 1: Install spaCy and download its model


pip install spacy

python -m spacy download en_core_web_sm
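Before moving on, it can help to confirm that both installs succeeded. The sketch below uses only the standard library's `importlib.util.find_spec` and relies on the fact that models fetched with `spacy download` are installed as regular Python packages. The helper name `is_installed` is made up for this example.

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package can be found without importing it."""
    return importlib.util.find_spec(package_name) is not None


for name in ("spacy", "en_core_web_sm"):
    status = "found" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```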


Step 2: Import spaCy and load the model

import spacy

# Load the small English pipeline downloaded in Step 1
nlp = spacy.load("en_core_web_sm")

Step 3: Process the text and extract entities

def extract_entities(text):
    """Return a list of (entity_text, entity_label) tuples found in text."""
    doc = nlp(text)
    entities = [(ent.text, ent.label_) for ent in doc.ents]
    return entities
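If you also need to know where each entity occurs, spaCy's entity spans expose `start_char` and `end_char`. The variant below is a sketch that takes an already-processed `doc` (for example `nlp(text)`); the function name `entities_with_offsets` is an assumption for this example. To keep the snippet runnable without a downloaded model, the demonstration feeds it a small stand-in object that mimics only the attributes used.

```python
from types import SimpleNamespace


def entities_with_offsets(doc):
    """Return one dict per entity with its text, label, and character offsets."""
    return [
        {
            "text": ent.text,
            "label": ent.label_,
            "start": ent.start_char,
            "end": ent.end_char,
        }
        for ent in doc.ents
    ]


# Stand-in for a spaCy Doc, hand-built for illustration (not real model output).
fake_doc = SimpleNamespace(
    ents=[
        SimpleNamespace(text="US", label_="GPE", start_char=0, end_char=2),
        SimpleNamespace(text="today", label_="DATE", start_char=30, end_char=35),
    ]
)
for row in entities_with_offsets(fake_doc):
    print(row)
```

With the model loaded, the equivalent call would be `entities_with_offsets(nlp(text))`.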

Step 4: Run the extractor on a sample of financial news


random_news = """

Sensex, Nifty close lower as global markets extend sell-off. The BSE Sensex closed 0.16% lower at 66,160.20, while the NSE Nifty50 index ended 0.07% lower at 19,646.00.

Crude oil prices rise as US dollar retreats. Brent crude oil prices rose 0.3% to $84.30 a barrel, while US West Texas Intermediate (WTI) crude oil prices rose 0.2% to $82.10 a barrel.

Gold prices edge higher as dollar weakens. Gold prices rose 0.1% to $1,739.40 an ounce.

US stock futures fall as investors await Fed decision. US stock futures fell ahead of the Federal Reserve's interest rate decision later today.

European stocks open lower as investors weigh Fed decision. European stocks opened lower as investors weighed the implications of the Federal Reserve's interest rate decision.

Asian stocks close mixed as investors await Fed decision. Asian stocks closed mixed as investors awaited the Federal Reserve's interest rate decision.

When the stock market is in a cautious mood, value investing gains prominence. When the valuations are high, the market may move very carefully. There could be a lot of volatility. The market can also fall sharply. In such a scenario, investors turn cautious and look for stocks that offer some safety. That is why many consider investing in value funds in an uncertain scenario. These funds invest in stocks with reasonable valuations. During volatile phases these schemes fare better than schemes t ..

"""


entities_list = extract_entities(random_news)

print(entities_list)
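The printed list of tuples can be hard to scan. One way to organize it, sketched below, is to group the entity texts by label while dropping repeats. `group_by_label` is a helper introduced for this example, and the `sample_entities` list is hand-written to show the shape of the data, not actual model output.

```python
from collections import defaultdict


def group_by_label(entities):
    """Group (text, label) pairs into {label: [unique texts in first-seen order]}."""
    grouped = defaultdict(list)
    for text, label in entities:
        if text not in grouped[label]:
            grouped[label].append(text)
    return dict(grouped)


# Hand-written sample in the same (text, label) shape that extract_entities returns.
sample_entities = [
    ("0.3%", "PERCENT"),
    ("US", "GPE"),
    ("0.2%", "PERCENT"),
    ("US", "GPE"),
]
for label, texts in group_by_label(sample_entities).items():
    print(f"{label}: {', '.join(texts)}")
```

In the article's flow, the call would be `group_by_label(entities_list)`.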



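Important entities, grouped by label. The exact output depends on the model version, and some groupings here are illustrative: COMMODITY is not a label that en_core_web_sm predicts, so picking up commodity terms would need a custom rule or a custom-trained model.

PERCENT: 0.3%, 0.2%, 0.1%

MONEY: 66,160.20, 19,646.00, $84.30, $82.10, $1,739.40

NORP: Asian, European

COMMODITY: Crude oil prices, Brent crude oil prices, Gold prices

ORG: Sensex, Nifty, BSE Sensex, NSE Nifty50, US stock futures, Fed, Federal Reserve

GPE: US

DATE: today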
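For the financial-analysis use case, MONEY and PERCENT entities are more useful as numbers than as strings. The sketch below converts strings like the ones listed above into floats. `parse_amount` is a hypothetical helper, and it only handles the simple formats that appear in this sample (an optional `$`, thousands separators, an optional trailing `%`).

```python
def parse_amount(entity_text):
    """Convert strings such as '$1,739.40' or '0.3%' to a float; return None on failure."""
    cleaned = entity_text.strip().lstrip("$").rstrip("%").replace(",", "")
    try:
        return float(cleaned)
    except ValueError:
        return None


for raw in ["$84.30", "$1,739.40", "66,160.20", "0.3%", "today"]:
    print(raw, "->", parse_amount(raw))
```

Returning `None` for text that is not numeric keeps entities such as dates from silently turning into wrong values.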


