Kyle Polich discusses named entity recognition (NER) in this podcast. My takeaways:

  • What is an entity in an unstructured dataset? It depends on the context and on the task the ML algorithm is trying to accomplish
  • spaCy is a Python package that can do NER
  • NER is used in chatbot and semantic search applications
  • Many NER packages are good but not great
  • Market research: NER can pull out the brands mentioned in a text
  • Wikipedia has a lot of markup, which makes NER easy to do on it
  • NER is also called entity identification, entity chunking or entity extraction
  • spaCy features:
    • Topic tagging
    • Tokenization
    • POS tagging
    • Text classification
    • NER
  • You can build your own NER model on top of spaCy
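
To make the market research point concrete, here is a minimal sketch of pulling brand mentions out of free text. It is not spaCy and not a trained NER model: it is a plain dictionary (gazetteer) lookup using only the Python standard library, which is a much simpler technique than the statistical models an NER package ships with. The brand list, the `find_brands` function, and the sample sentence are all made up for illustration.

```python
import re

# Hypothetical brand list for illustration; a real project would supply its own.
BRANDS = ["Nike", "Adidas", "Coca-Cola"]


def find_brands(text, brands=BRANDS):
    """Return (brand, start, end) for each brand mention found in text."""
    # Try longer names first so a long brand wins over a shorter one it contains.
    ordered = sorted(brands, key=len, reverse=True)
    pattern = re.compile(
        # The lookarounds stop matches inside longer words (e.g. "Nikely").
        r"(?<!\w)(" + "|".join(re.escape(b) for b in ordered) + r")(?!\w)",
        flags=re.IGNORECASE,
    )
    # Map any casing found in the text back to the canonical spelling.
    canonical = {b.lower(): b for b in brands}
    return [
        (canonical[m.group(1).lower()], m.start(1), m.end(1))
        for m in pattern.finditer(text)
    ]


review = "I swapped my adidas runners for Nike, then grabbed a Coca-Cola."
mentions = find_brands(review)
print([brand for brand, _, _ in mentions])  # ['Adidas', 'Nike', 'Coca-Cola']
```

A lookup like this only finds names it already knows and cannot use context to tell a brand from an ordinary word, which is the gap a learned NER model is meant to close. It also shows why the definition of an entity depends on the task: here the only entity type that matters is a brand.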