Friday, 25 March 2022

Hugging Face Introduction

 Hugging Face provides libraries for Natural Language Processing (NLP) built on transformer models. It can be used for the following tasks:

  • Sentiment Analysis - Labels a text as positive or negative and returns a confidence score for that label
  • Zero-Shot Classification - Classifies a text against the topics you supply, assigning a score to each topic
  • Text Generation - Continues a short prompt by generating the text that follows it
  • Mask Filling - Predicts the hidden (masked) word in a string
  • Named Entity Recognition - Classifies entities into predefined categories such as organizations, locations, quantities, etc.
  • Question Answering - Answers questions based on the context passed to the pipeline
  • Summarization - Condenses a long text into a short summary
  • Translation - Translates text from one language to another

 

Monday, 14 March 2022

Python libraries for Natural Language Processing

 Python provides multiple libraries for handling text data. Some of the prominent ones are:

1) Natural Language Toolkit (NLTK)
One of the most commonly used libraries for tokenization, lemmatization, stemming, parsing, chunking and POS tagging.

2) Gensim
This library is mainly used for topic modelling, document indexing and similarity retrieval with large corpora. Algorithms in Gensim are memory-independent with respect to corpus size, so they can process input larger than RAM.

3) spaCy
An open-source NLP library mainly used in production on large volumes of data. It supports tokenization for over 49 languages.
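
 To make the tokenization and stemming steps mentioned above concrete, here is a minimal sketch that tokenizes with spaCy and stems with NLTK. It assumes both packages are installed (`pip install spacy nltk`). `spacy.blank("en")` creates a tokenizer-only English pipeline, so no trained model has to be downloaded. The helper names `tokenize` and `stem_tokens` are made up for this example.

```python
import spacy
from nltk.stem import PorterStemmer

# Blank English pipeline: rule-based tokenizer only, no trained components.
nlp = spacy.blank("en")
stemmer = PorterStemmer()


def tokenize(text):
    """Split text into tokens using spaCy's English tokenizer."""
    return [token.text for token in nlp(text)]


def stem_tokens(tokens):
    """Reduce each token to its stem with NLTK's Porter stemmer."""
    return [stemmer.stem(token) for token in tokens]


tokens = tokenize("The cats are running.")
print(tokens)
print(stem_tokens(tokens))
```

 Stemming is a rule-based truncation, so a stem such as "run" for "running" is not guaranteed to be a dictionary word in every case. Lemmatization, which NLTK also offers, maps words to dictionary forms instead.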