Python: How would I write a function that takes a URL of a webpage and finds the first link on the page? The hint given is the function should return a tuple holding two strings: the URL and the link text.
The Python code below uses BeautifulSoup to find the first link in an HTML document:
from bs4 import BeautifulSoup

html = """
<html><head></head>
<body>
<h2 class='title'><a href='http://www.mypage.com'>My HomePage</a></h2>
<h2 class='title'><a href='http://www.mypage.com/sections'>Sections</a></h2>
</body>
</html>
"""

# To use your own webpage instead, read it from a file:
# with open("index.html") as fp:
#     soup = BeautifulSoup(fp, 'html.parser')
soup = BeautifulSoup(html, 'html.parser')

def extractUrlText():
    # Find the first <a> tag that has an href attribute.
    link = soup.find('a', href=True)
    if link is None:
        return None  # the page contains no links
    # Read the URL and the text from the same tag so they always match.
    url = link['href']
    urlText = link.get_text(strip=True)
    return (url, urlText)

print(extractUrlText())
Call extractUrlText() to get the URL and the link text of the first link on the page.
Sample Output:
('http://www.mypage.com', 'My HomePage')
Here, as the question requires, the URL and the link text are returned as a tuple.
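The code above parses a hard-coded HTML string, while the question asks for a function that takes the URL of a webpage. One way to cover that step is sketched below. It is only a sketch: it assumes the page can be downloaded over HTTP, and it uses the standard library (urllib.request and html.parser) instead of BeautifulSoup so that it runs without extra packages. The names FirstLinkParser, first_link_in_html and first_link are made up for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class FirstLinkParser(HTMLParser):
    """Records the href and the text of the first <a href=...> tag it sees."""

    def __init__(self):
        super().__init__()
        self.href = None        # URL of the first link, once found
        self.text_parts = []    # pieces of text found inside that link
        self.in_link = False    # True while we are inside the first link

    def handle_starttag(self, tag, attrs):
        # Only the first <a> that carries an href is of interest.
        if tag == "a" and self.href is None:
            href = dict(attrs).get("href")
            if href is not None:
                self.href = href
                self.in_link = True

    def handle_data(self, data):
        if self.in_link:
            self.text_parts.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self.in_link:
            self.in_link = False


def first_link_in_html(html, base_url=""):
    """Return (url, link_text) for the first link in html, or None."""
    parser = FirstLinkParser()
    parser.feed(html)
    if parser.href is None:
        return None
    # Resolve relative links such as "/about" against the page URL.
    url = urljoin(base_url, parser.href)
    return (url, "".join(parser.text_parts).strip())


def first_link(page_url):
    """Download page_url and return (url, link_text) for its first link."""
    with urlopen(page_url) as response:
        # Assumption: fall back to UTF-8 when the server names no charset.
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return first_link_in_html(html, base_url=page_url)
```

Calling first_link() with the address of a page downloads that page and returns the tuple for its first link, or None if the page has no links. The parsing is kept in first_link_in_html() so it can be tried on a plain string without a network connection.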
I hope you find the solution helpful.
Keep Learning!