Python Tutorial

Saturday, November 3, 2012

Beautiful Soup CSS selector

Beautiful Soup supports a subset of the CSS selector standard. Construct the selector as a string and pass it to the .select() method of a Tag or of the BeautifulSoup object itself; the call returns a list of the matching tags.
I used this HTML file for practice. All source code is available on GitHub.
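To illustrate the difference between calling .select() on the whole document and on a single Tag, here is a minimal sketch. The inline markup is a hypothetical stand-in for the practice file, and the built-in 'html.parser' backend is an assumption on my part.

```python
from bs4 import BeautifulSoup

# Hypothetical inline markup, not the practice file used in this post.
html = """
<html><head><title>Sample</title></head>
<body>
<p class="story"><a href="/one" class="sister" id="link1">One</a></p>
<div><a href="/two" class="sister" id="link2">Two</a></div>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# select() on the BeautifulSoup object searches the whole document.
all_links = soup.select("a.sister")

# select() on a Tag searches only that tag's descendants.
paragraph = soup.select("p.story")[0]
links_in_paragraph = paragraph.select("a")

print([a["id"] for a in all_links])           # ['link1', 'link2']
print([a["id"] for a in links_in_paragraph])  # ['link1']
```

In both cases the result is a plain list, so an empty list means nothing matched.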

from pprint import pprint
from bs4 import BeautifulSoup

with open('bs_sample3.html') as html_file:  # close the file once parsing is done
    soup = BeautifulSoup(html_file, 'html.parser')  # making the soup

pprint(soup.select("title"))  # get the title tag
pprint(soup.select("body a"))  # all a tags anywhere inside body
pprint(soup.select("html head title"))  # html->head->title
pprint(soup.select("head > title"))  # title tags directly under head
pprint(soup.select("p > a"))  # all a tags directly under a p tag
pprint(soup.select("body > a"))  # all a tags directly under body
pprint(soup.select(".sister"))  # select by class
pprint(soup.select("#link1"))  # select by id
# find tags by attribute value
# find tags by attribute value, all contains ''
pprint(soup.select('p[lang|=en]'))  # match language codes
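The attribute selectors mentioned in the comments above can be tried out with a short self-contained sketch. The markup, URLs and variable names below are my own hypothetical examples, not taken from the practice file; the operators themselves (=, ^=, $=, *=, |=) are standard CSS attribute selectors.

```python
from bs4 import BeautifulSoup

# Hypothetical markup used only for this illustration.
html = """
<body>
<a href="http://example.com/elsie" id="link1">Elsie</a>
<a href="http://example.com/lacie" id="link2">Lacie</a>
<a href="http://other.example.org/tillie" id="link3">Tillie</a>
<p lang="en">Hello</p>
<p lang="en-us">Howdy</p>
<p lang="fr">Bonjour</p>
</body>
"""

soup = BeautifulSoup(html, "html.parser")

# Exact attribute value.
exact = soup.select('a[href="http://example.com/elsie"]')
# Attribute value starts with the given string.
prefix = soup.select('a[href^="http://example.com/"]')
# Attribute value ends with the given string.
suffix = soup.select('a[href$="tillie"]')
# Attribute value contains the given string anywhere.
contains = soup.select('a[href*=".com/el"]')
# Language codes: matches "en" itself and anything starting with "en-".
english = soup.select('p[lang|="en"]')

for name, tags in [("exact", exact), ("prefix", prefix),
                   ("suffix", suffix), ("contains", contains)]:
    print(name, [t["id"] for t in tags])
print("english", [p.get_text() for p in english])
```

Quoting the attribute value, as done here, avoids surprises when the value contains characters such as ':' or '/'.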

4 comments: said...

thank you very much. i will use it

Abu Zahed Jony said...

you are most welcome

Stephen Ziegler said...

Very nice examples. Thanks.

Penulis Tutorial said...

nice one, thanks.
