Extracting locations from text using Python
Last Updated: 21 Jun, 2022
This article shows how to extract locations from text using Python.
When working with text, you may need to detect the cities, regions, states, and countries it mentions, along with the relationships between them. This is useful for geographical studies, among other things. This article uses the locationtagger library for the task.
Text mining that relies on grammar-based rules and statistical modelling is usually carried out with NER (Named Entity Recognition) algorithms. An entity extracted by NER can be the name of a person, place, organization, or product. The locationtagger library builds on NER by further tagging and filtering the place names out of all the entities found.
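To make the filtering idea concrete, here is a minimal sketch that sorts a list of already-extracted entity strings into countries, regions, cities, and everything else. The lookup sets and the function name classify_entities are made up for illustration; locationtagger ships its own, much larger location data, and its internals may differ from this.

```python
# Tiny hand-written gazetteer; hypothetical data for illustration only.
COUNTRIES = {"India", "Pakistan", "Bangladesh"}
REGIONS = {"Kerala", "Nagaland", "Haryana", "Maharashtra"}
CITIES = {"Delhi", "Mumbai", "Munich", "London"}


def classify_entities(entities):
    """Sort entity strings into place categories by set membership."""
    result = {"countries": [], "regions": [], "cities": [], "other": []}
    for name in entities:
        if name in COUNTRIES:
            result["countries"].append(name)
        elif name in REGIONS:
            result["regions"].append(name)
        elif name in CITIES:
            result["cities"].append(name)
        else:
            # Anything not in the gazetteer is kept as a non-place entity.
            result["other"].append(name)
    return result


# Entities as an NER step might return them (assumed input).
found = classify_entities(["India", "Kerala", "Mumbai", "Reserve Bank"])
print(found)
```

A plain set lookup like this cannot resolve names that belong to more than one category or more than one country, which is one reason to rely on a dedicated library instead.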
Installation:
To install the module, run the following command in the terminal.
pip install locationtagger
After installation, download the NLTK data packages that the library needs by running the following code.
Python3
import nltk
# download the NLTK data packages used for tokenizing, tagging and entity chunking
nltk.download('maxent_ne_chunker')
nltk.download('words')
nltk.download('treebank')
nltk.download('maxent_treebank_pos_tagger')
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')
Also download the spaCy English model from the command line:
python -m spacy download en_core_web_sm
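Before moving on, it can help to confirm that the packages are importable. The following is a small sketch using only the standard library; the helper name missing_packages is hypothetical. It checks the installed packages only, not whether the NLTK data or the spaCy model were downloaded.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the names from `names` that cannot be imported."""
    return [name for name in names if find_spec(name) is None]


# Top-level packages this article relies on.
required = ["locationtagger", "nltk", "spacy"]
missing = missing_packages(required)
if missing:
    print("Please install:", ", ".join(missing))
else:
    print("All required packages are importable.")
```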
Example 1: Printing countries, cities and regions from text.
The following function and attributes return the countries, regions, and cities found in the text.
Functions Used:
- locationtagger.find_locations(text) : Returns an entity object holding the location information. The "text" parameter takes the input text.
- entity.countries : Extracts all the countries in the text.
- entity.regions : Extracts all the regions (states) in the text.
- entity.cities : Extracts all the cities in the text.
Code:
Python3
import locationtagger
# initializing sample text
# (adjacent string literals are joined, so each piece ends with a space)
sample_text = (
    "India has very rich and vivid culture "
    "widely spread from Kerala to Nagaland to Haryana to Maharashtra. "
    "Delhi being capital with Mumbai financial capital. "
    "Can be said better than some western cities such as "
    "Munich, London etc. Pakistan and Bangladesh share its borders"
)
# extracting entities.
place_entity = locationtagger.find_locations(text=sample_text)
# getting all countries
print("The countries in text : ")
print(place_entity.countries)
# getting all states
print("The states in text : ")
print(place_entity.regions)
# getting all cities
print("The cities in text : ")
print(place_entity.cities)
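If the results are needed for further processing rather than printing, the three lists can be copied into a plain dictionary. This is a minimal sketch: the helper name places_to_dict is hypothetical, and a SimpleNamespace stand-in with hand-picked values replaces the real entity so that the snippet runs without the library and its models. With locationtagger installed, place_entity from the code above could be passed in instead.

```python
import json
from types import SimpleNamespace


def places_to_dict(entity):
    """Copy the three place lists from an entity into a plain dict."""
    return {
        "countries": list(entity.countries),
        "regions": list(entity.regions),
        "cities": list(entity.cities),
    }


# Stand-in with hand-picked values, NOT real locationtagger output.
fake_entity = SimpleNamespace(
    countries=["India"], regions=["Kerala"], cities=["Mumbai"]
)
summary = places_to_dict(fake_entity)
print(json.dumps(summary, indent=2))
```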
Example 2: Extracting Relations of locations
This example uses the attributes that relate the cities, regions, and countries found in the text to one another.
Functions Used:
- entity.country_regions : Extracts the countries whose regions are found in the text.
- entity.country_cities : Extracts the countries whose cities are found in the text.
- entity.other_countries : Extracts the list of all countries whose regions or cities are present in the text.
- entity.region_cities : Extracts the regions whose cities are found in the text.
- entity.other_regions : Extracts the list of all regions whose cities are present in the text.
- entity.other : Holds all entities that are not recognized as place names.
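One way to picture these relations is as a grouping of lower-level places under their parent. The sketch below groups regions by country using a small hand-written lookup table. The table, the function name group_by_country, and the dict-of-lists result shape are assumptions made for illustration; they do not describe how locationtagger stores or returns its data.

```python
from collections import defaultdict

# Hypothetical region -> country lookup; illustration only.
REGION_TO_COUNTRY = {
    "Kerala": "India",
    "Nagaland": "India",
    "Sindh": "Pakistan",
}


def group_by_country(regions):
    """Group known regions under their country; skip unknown names."""
    grouped = defaultdict(list)
    for region in regions:
        country = REGION_TO_COUNTRY.get(region)
        if country is not None:
            grouped[country].append(region)
    return dict(grouped)


relations = group_by_country(["Kerala", "Sindh", "Nagaland", "Atlantis"])
print(relations)
```

Note that a one-to-one table cannot represent a region name that exists in several countries, so real location data needs a richer structure.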
Python3
import locationtagger
# initializing sample text
# (adjacent string literals are joined, so each piece ends with a space)
sample_text = (
    "India has very rich and vivid culture widely "
    "spread from Kerala to Nagaland to Haryana to Maharashtra. "
    "Mumbai being financial capital can be said better "
    "than some western cities such as "
    "Lahore, Canberra etc. Pakistan and Nepal share its borders"
)
# extracting entities.
place_entity = locationtagger.find_locations(text=sample_text)
# getting all country regions
print("The countries regions in text : ")
print(place_entity.country_regions)
# getting all country cities
print("The countries cities in text : ")
print(place_entity.country_cities)
# getting all other countries
print("All other countries in text : ")
print(place_entity.other_countries)
# getting all region cities
print("The region cities in text : ")
print(place_entity.region_cities)
# getting all other regions
print("All other regions in text : ")
print(place_entity.other_regions)
# getting all other entities
print("All other entities in text : ")
print(place_entity.other)