Python | Sorting URL on basis of Top Level Domain
Last Updated : 11 May, 2020
Given a list of URLs, the task is to sort the URLs in the list by their top-level domain.
A top-level domain (TLD) is one of the domains at the highest level in the hierarchical Domain Name System of the Internet. Examples: org, com, edu.
This is mostly useful when scraping pages and grouping the collected URLs by top-level domain. The snippets below are short enough to drop into a larger script.
Input :
url = ["https://www.isb.edu", "www.google.com",
"http://cyware.com", "https://www.gst.in",
"https://www.coursera.org", "https://www.create.net",
"https://www.ontariocolleges.ca"]
Output :
['https://www.ontariocolleges.ca', 'www.google.com',
'http://cyware.com', 'https://www.isb.edu',
'https://www.gst.in', 'https://www.create.net',
'https://www.coursera.org']
Explanation:
The TLDs of the URLs in the output, in sorted order, are
['.ca', '.com', '.com', '.edu', '.in', '.net', '.org']
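As a quick check, the TLD list in the explanation can be reproduced in a few lines. This is a minimal sketch that assumes every URL ends with its hostname (no path, port, or query string); the names `urls` and `tlds` are chosen here for illustration.

```python
# Sample URLs from the example above
urls = ["https://www.isb.edu", "www.google.com", "http://cyware.com",
        "https://www.gst.in", "https://www.coursera.org",
        "https://www.create.net", "https://www.ontariocolleges.ca"]

# Take the text after the last dot and prefix it with '.'
# to match the notation used in the explanation
tlds = sorted('.' + url.rsplit('.', 1)[-1] for url in urls)

print(tlds)
```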
Below are some ways to do the above task.
Method 1: Using sorted
Split each URL on '.' and pass sorted() a key function that returns the last piece, which is the TLD.
Python3
# Python code to sort the URLs in a list by top-level domain.

# URL list initialization
Input = ["https://www.isb.edu", "www.google.com", "http://cyware.com",
         "https://www.gst.in", "https://www.coursera.org",
         "https://www.create.net", "https://www.ontariocolleges.ca"]

# Key function: return the text after the last dot (the TLD)
def tld(url):
    return url.split('.')[-1]

# Sort using tld() as the key
Output = sorted(Input, key=tld)

# Printing output
print("Initial list is :")
print(Input)
print("Sorted list according to TLD is :")
print(Output)

Output :
Initial list is :
['https://www.isb.edu', 'www.google.com', 'http://cyware.com',
'https://www.gst.in', 'https://www.coursera.org',
'https://www.create.net', 'https://www.ontariocolleges.ca']
Sorted list according to TLD is :
['https://www.ontariocolleges.ca', 'www.google.com',
'http://cyware.com', 'https://www.isb.edu',
'https://www.gst.in', 'https://www.create.net', 'https://www.coursera.org']
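Splitting on the last dot only works when the URL ends with its hostname. For URLs that carry a path, port, or query string, one possible approach is to parse out the hostname first with the standard library's urllib.parse.urlsplit. The sketch below is an illustration, not part of the original method: the helper name `tld_of` and the example URLs are assumptions, and it returns only the last hostname label, so multi-label suffixes such as "co.uk" are not treated specially.

```python
from urllib.parse import urlsplit

def tld_of(url):
    """Hypothetical helper: return the last label of the URL's hostname."""
    # urlsplit only recognises the host when a scheme or a leading '//'
    # is present, so bare entries like 'www.example.com/about' get '//' added.
    if '://' not in url and not url.startswith('//'):
        url = '//' + url
    host = urlsplit(url).hostname or ''
    return host.rstrip('.').rsplit('.', 1)[-1]

# Example URLs (assumed for illustration) with a path, a query and a port
urls = ["https://www.example.org/docs/index.html",
        "www.example.com/about?v=1.2",
        "http://example.net:8080/"]

result = sorted(urls, key=tld_of)
print(result)
```

With a plain `url.split('.')[-1]` key, the same three URLs would be keyed on "html", "2" and "net:8080/" instead of their TLDs.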
Method 2: Using Lambda
The same key can be written inline as a lambda, which avoids defining a separate function and keeps the sort to a single line.
Python3
# Python code to sort the URLs in a list by top-level domain.

# URL list initialization
Input = ["https://www.isb.edu", "www.google.com", "http://cyware.com",
         "https://www.gst.in", "https://www.coursera.org",
         "https://www.create.net", "https://www.ontariocolleges.ca"]

# Sort with an inline lambda that returns the text after the last dot
Output = sorted(Input, key=lambda x: x.split('.')[-1])

# Printing output
print("Initial list is :")
print(Input)
print("Sorted list according to TLD is :")
print(Output)

Output :
Initial list is :
['https://www.isb.edu', 'www.google.com', 'http://cyware.com',
'https://www.gst.in', 'https://www.coursera.org',
'https://www.create.net', 'https://www.ontariocolleges.ca']
Sorted list according to TLD is :
['https://www.ontariocolleges.ca', 'www.google.com',
'http://cyware.com', 'https://www.isb.edu',
'https://www.gst.in', 'https://www.create.net', 'https://www.coursera.org']
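Because sorted() is stable, URLs that share a TLD keep their original relative order, which is why 'www.google.com' stays ahead of the cyware URL above. If a deterministic order within each TLD is wanted, one option is a tuple key. The sketch below assumes the full URL string is an acceptable tie-breaker; the variable names are illustrative.

```python
urls = ["www.google.com", "http://cyware.com", "https://www.gst.in"]

# Stable sort: entries with equal keys keep their input order
by_tld = sorted(urls, key=lambda u: u.split('.')[-1])

# Tuple key: compare the TLD first, then the whole URL string
by_tld_then_url = sorted(urls, key=lambda u: (u.split('.')[-1], u))

print(by_tld)
print(by_tld_then_url)
```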
Method 3: Using reversed
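Split each URL on '.', reverse the resulting list, and use that list as the sort key. Lists are compared element by element, so the TLD is compared first and the remaining parts of the URL break ties.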
Python3
# Python code to sort the URLs in a list by top-level domain.

# URL list initialization
Input = ["https://www.isb.edu", "www.google.com", "http://cyware.com",
         "https://www.gst.in", "https://www.coursera.org",
         "https://www.create.net", "https://www.ontariocolleges.ca"]

# Key function: dot-separated parts in reverse order, TLD first
def internal(string):
    return list(reversed(string.split('.')))

# Sort using the reversed parts as the key
Output = sorted(Input, key=internal)

# Printing output
print("Initial list is :")
print(Input)
print("Sorted list according to TLD is :")
print(Output)

Output :
Initial list is :
['https://www.isb.edu', 'www.google.com', 'http://cyware.com',
'https://www.gst.in', 'https://www.coursera.org',
'https://www.create.net', 'https://www.ontariocolleges.ca']
Sorted list according to TLD is :
['https://www.ontariocolleges.ca', 'www.google.com',
'http://cyware.com', 'https://www.isb.edu',
'https://www.gst.in', 'https://www.create.net', 'https://www.coursera.org']
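The effect of the reversed key is easier to see on bare hostnames, where it orders entries by TLD, then by domain, then by subdomain. The following sketch uses made-up hostnames and a hypothetical helper `reversed_labels` purely for illustration.

```python
def reversed_labels(host):
    """Hypothetical helper: hostname labels in reverse order, TLD first."""
    return host.split('.')[::-1]

# Example hostnames (assumed for illustration)
hosts = ["mail.example.org", "docs.python.org",
         "www.example.com", "api.example.com"]

# Show the key that sorted() compares for each hostname
for host in hosts:
    print(host, "->", reversed_labels(host))

ordered = sorted(hosts, key=reversed_labels)
print(ordered)
```

Here the two .com hosts come first and are ordered by subdomain ("api" before "www"), followed by the .org hosts ordered by domain ("example" before "python").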