Life is very easy with Python: Python Beautiful Soup Url extract from web page

Saturday, January 28, 2012

Python Beautiful Soup: extract URLs from a web page

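# Python 2, using BeautifulSoup 3 (the `BeautifulSoup` package)
from BeautifulSoup import BeautifulSoup, SoupStrainer
import urllib2

def get_url_content(site_url):
    """Return the body of site_url, or the error page body on an HTTP error."""
    content = ""
    try:
        request = urllib2.Request(site_url)
        f = urllib2.urlopen(request)
        try:
            content = f.read()
        finally:
            f.close()
    except urllib2.HTTPError as error:
        # The server answered with an error status; keep the error page body.
        content = error.read()
    except urllib2.URLError as error:
        # No response at all (DNS failure, refused connection, ...).
        print "Could not reach %s: %s" % (site_url, error.reason)
    return content

response = get_url_content('http://www.sust.edu/')

# Parse only the <a> tags and print the href of each one that has it.
for link in BeautifulSoup(response, parseOnlyThese=SoupStrainer('a')):
    if link.has_key('href'):
        print link['href']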

Output:

Every URL linked from the page, one per line.
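
The listing above targets Python 2: `urllib2` does not exist in Python 3, and BeautifulSoup 3 was written for Python 2. For readers on Python 3, here is a rough sketch of the same idea using only the standard library. Note that it swaps Beautiful Soup for `html.parser`, so it is a different technique, not the method from this post. The names `LinkCollector`, `extract_links` and `sample_html` are made up for illustration, and decoding the page as UTF-8 is an assumption that will not hold for every site.

```python
import urllib.error
import urllib.request
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; names arrive lowercased.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def extract_links(html):
    """Return the href of each <a> tag in html, in document order."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


def get_url_content(site_url):
    """Download a page as text (assumes UTF-8; bad bytes are replaced)."""
    try:
        with urllib.request.urlopen(site_url) as f:
            return f.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as error:
        # Same behaviour as the Python 2 version: keep the error page body.
        return error.read().decode("utf-8", errors="replace")


# Hypothetical sample markup, standing in for a downloaded page.
sample_html = """
<html><body>
  <a href="http://www.sust.edu/">Home</a>
  <a name="top">No href here</a>
  <a href="/about">About</a>
</body></html>
"""

for url in extract_links(sample_html):
    print(url)
```

On the sample markup this prints `http://www.sust.edu/` and `/about`, skipping the anchor that has no `href`. To run it against a live page, pass the result of `get_url_content('http://www.sust.edu/')` to `extract_links` instead of `sample_html`.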

1 comment:

Gorkem Yurtseven said (April 18, 2012 at 7:48 AM):
Hi, I am trying to extract soccer game scores with Python. Do you have any recommendations for how that can be done?
