Python Advanced Module Reference - json, requests and logging
(Python interview question)
Write a Python script to read a JSON file, verify that the weblink provided for each student is valid, and save the valid URLs in a list.
Here is the sample JSON file provided for parsing.
# sample.json file
{
  "count": 4,
  "students": [
    {
      "id": 1,
      "name": "Alice",
      "age": 20,
      "weblink": "https://www.google.com",
      "grades": {
        "math": 90,
        "science": 85,
        "history": 88
      }
    },
    {
      "id": 2,
      "name": "Bob",
      "age": 21,
      "weblink": "https://github.com",
      "grades": {
        "math": 88,
        "science": 82,
        "history": 90
      }
    },
    {
      "id": 3,
      "name": "Charlie",
      "age": 19,
      "weblink": "https://w3schools.com/python/demopage.htm",
      "grades": {
        "math": 92,
        "science": 80,
        "history": 85
      }
    },
    {
      "id": 4,
      "name": "Delta",
      "age": 19,
      "weblink": "",
      "grades": {
        "math": 92,
        "science": 80,
        "history": 85
      }
    }
  ]
}
We will be using three modules in this article.
json - helps to parse JSON values from any data source. This module enables you to work with JSON (JavaScript Object Notation) data, which is commonly used for data interchange between different systems and languages due to its simplicity and human-readable format. Two common functions are json.dumps(), which converts Python dictionaries, lists or other objects into a JSON-formatted string, and json.load(), which reads JSON from a file object and converts it into Python data structures such as dictionaries and lists. In the main program below, json.load() reads the entire content of sample.json and stores the parsed result in the variable dump.
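As a minimal sketch (assuming the sample.json file shown above is saved in the working directory), this is how json.load() and json.dumps() behave:

# Parse sample.json and convert part of it back to a JSON string
import json

with open('sample.json') as f:
    data = json.load(f)              # parses the file into a Python dict

print(data["count"])                 # 4
print(data["students"][0]["name"])   # Alice

# json.dumps() converts a Python object into a JSON-formatted string
print(json.dumps(data["students"][0], indent=2))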
requests - this library helps in making HTTP requests and interacting with web services and APIs. requests.get(url) makes an HTTP GET request to fetch web data, and requests.post(url, data=...) sends data to a web link. In this exercise we use requests.get(url) to fetch web data. The value returned is a response object (named record in the sketch below), which has useful attributes such as record.status_code, record.headers and record.text.
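A minimal sketch of requests.get() usage; the URL is one of the weblinks from the sample file, and the exact output depends on the live response:

# Fetch a page and inspect the response object
import requests

record = requests.get("https://www.google.com", timeout=5)

print(record.status_code)                  # e.g. 200 on success
print(record.headers.get("Content-Type"))  # response headers behave like a dictionary
print(record.text[:100])                   # first 100 characters of the response body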
logging - its purpose is to enable developers to track events that occur during the execution of a program, for monitoring, debugging and error handling. Simply put, messages are written to the console (standard error by default) depending on the configured log level. In the code below we have configured the log level as INFO.
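A short sketch showing how the configured level filters messages: with the level set to INFO, DEBUG messages are suppressed while INFO and ERROR messages are emitted.

# A quick demonstration of log-level filtering
import logging

logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')

logging.debug("This DEBUG message is suppressed because the level is INFO")
logging.info("This INFO message is printed")
logging.error("This ERROR message is printed")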
# Program to read a JSON file, parse through the data,
# verify that the link for each student record is valid and
# store only the valid URLs in a list
import json
import requests
import logging

# Configure logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')

# Load JSON data from file
with open('sample.json') as f:
    dump = json.load(f)

weblist = []

for student in dump["students"]:
    logging.info("Checking link: %s", student['weblink'])
    try:
        record = requests.get(student['weblink'], timeout=5)
        if record.status_code == 200:
            weblist.append(student['weblink'])
        else:
            logging.error("Failed to connect or HTTP status is not 200: %s", student['weblink'])
    except requests.exceptions.RequestException:
        # Raised for blank weblinks, malformed URLs, timeouts and connection errors
        logging.error("Weblink is blank or the link is invalid: %s", student['weblink'])

print("Final valid weblist is:", weblist)
The above code snippet is self-explanatory, using a for loop, if conditions and exception handling. For any queries, feel free to write a comment; I will be happy to discuss.