Python Tutorial

Friday, November 23, 2012

Design pattern in python : Factory method


Factory method pattern in python.
All source code available on github

class Book:
    def book_category(self):    pass

class PythonBook(Book):
    def book_category(self):
        print "Python book"

class JavaBook(Book):
    def book_category(self):
        print "Java book"

class BookFactory:
    def get_book(self, book_type):
        if book_type=='python':
            return PythonBook()
        elif book_type=='java':
            return JavaBook()
        else:
            return None

bookFactory = BookFactory()
pythonBook = bookFactory.get_book('python')
pythonBook.book_category()

javaBook = bookFactory.get_book('java')
javaBook.book_category()



Output:
Python book
Java book
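
The if/elif chain in get_book grows with every new book type. One common variation keeps a dictionary that maps type names to classes. The sketch below is written for Python 3 and is only an illustration, not the code from the github repository: the `_registry` attribute and the `ValueError` for unknown types are assumptions (the original returns None instead), and the methods return strings rather than printing them.

```python
class Book(object):
    def book_category(self):
        raise NotImplementedError


class PythonBook(Book):
    def book_category(self):
        return "Python book"


class JavaBook(Book):
    def book_category(self):
        return "Java book"


class BookFactory(object):
    # Hypothetical registry: maps a type name to the class to instantiate.
    _registry = {
        "python": PythonBook,
        "java": JavaBook,
    }

    def get_book(self, book_type):
        try:
            book_class = self._registry[book_type]
        except KeyError:
            # Assumption: fail loudly instead of returning None.
            raise ValueError("unknown book type: %r" % (book_type,))
        return book_class()


factory = BookFactory()
print(factory.get_book("python").book_category())
print(factory.get_book("java").book_category())
```

Adding a new book type then means adding one class and one dictionary entry, without touching get_book.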

Design pattern in python: Template method


Template method design pattern in python.
All source code available on github

class MakeMeal:

    def prepare(self):  pass
    def cook(self): pass
    def eat(self):  pass

    def go(self):
        self.prepare()
        self.cook()
        self.eat()

class MakePizza(MakeMeal):

    def prepare(self):
        print "Prepare Pizza"

    def cook(self):
        print "Cook Pizza"

    def eat(self):
        print "Eat Pizza"

class MakeTea(MakeMeal):

    def prepare(self):
        print "Prepare Tea"

    def cook(self):
        print "Cook Tea"

    def eat(self):
        print "Eat Tea"

makePizza = MakePizza()
makePizza.go()

print 25*"+"

makeTea = MakeTea()
makeTea.go()



Output:
Prepare Pizza
Cook Pizza
Eat Pizza
+++++++++++++++++++++++++
Prepare Tea
Cook Tea
Eat Tea
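
In the code above a subclass can forget to override a step and nothing complains, because the base methods just pass. A possible variation, sketched here for Python 3 with the standard abc module, marks the steps as abstract so that an incomplete subclass cannot be instantiated. Returning the messages from go() instead of printing them is a choice made for this sketch only.

```python
from abc import ABC, abstractmethod


class MakeMeal(ABC):
    """Base class: go() fixes the order, subclasses supply the steps."""

    @abstractmethod
    def prepare(self):
        pass

    @abstractmethod
    def cook(self):
        pass

    @abstractmethod
    def eat(self):
        pass

    def go(self):
        # Template method: call the steps in a fixed order and collect
        # the messages so the caller can print or inspect them.
        return [self.prepare(), self.cook(), self.eat()]


class MakePizza(MakeMeal):
    def prepare(self):
        return "Prepare Pizza"

    def cook(self):
        return "Cook Pizza"

    def eat(self):
        return "Eat Pizza"


for line in MakePizza().go():
    print(line)
```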

hashlib : secure hashes and message digests


hashlib implements many secure hash and message digest algorithms (SHA1, SHA224, SHA256, SHA384, SHA512, MD5). Let's have a look.

 
import hashlib

message = "python"

print "md5"
print hashlib.md5(message).hexdigest()

print "sha1"
print hashlib.sha1(message).hexdigest()

print "sha512"
print hashlib.sha512(message).hexdigest()

print "sha224"
print hashlib.sha224(message).hexdigest()

print "sha256"
print hashlib.sha256(message).hexdigest()

print "sha384"
print hashlib.sha384(message).hexdigest()



Output:
md5
23eeeb4347bdd26bfc6b7ee9a3b755dd
sha1
4235227b51436ad86d07c7cf5d69bda2644984de
sha512
ecc579811643b170cbd88fd0d0e323d1e1acc7cef8f73483a70abea01a89afa8015295f617f27447ba05e928e47a0b3a46dc79e72f99d1333856e23eeff97d8b
sha224
dace1c32d56e6f2bd077266a5a381fcf7ff9052e0a269e32cd52a551
sha256
11a4a60b518bf24989d481468076e5d5982884626aed9faeb35b8576fcd223e1
sha384
2690f7fce3051903a4e8b9f1f9ea705f070f03f9d84c353f2653cece80ea68130ef8defd53ef29af5f236e6cac7c7efb
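
Two things are worth knowing beyond the example above: on Python 3 the hash functions accept bytes rather than str, and a hash object can be fed in pieces with update(), which is how large files are usually hashed. The sketch below is written for Python 3; the helper name `hex_digest` and the tiny `chunk_size` are made up for illustration.

```python
import hashlib


def hex_digest(data, algorithm="sha256", chunk_size=4):
    """Hash `data` (bytes) in chunks, the way a large file would be read."""
    hasher = hashlib.new(algorithm)
    for start in range(0, len(data), chunk_size):
        hasher.update(data[start:start + chunk_size])
    return hasher.hexdigest()


message = "python".encode("utf-8")  # hashlib needs bytes on Python 3
print(hex_digest(message, "md5"))
print(hex_digest(message, "sha256"))
```

Feeding the data in chunks gives the same digest as hashing it in one call, so the chunk size only affects memory use.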

Monday, November 19, 2012

python FuzzyWuzzy : Levenshtein distance


FuzzyWuzzy is very easy to use. Let's see some examples.
All source code available on github

from fuzzywuzzy import fuzz
from fuzzywuzzy import process

print fuzz.ratio("this is a test", "this is a test!") # simple ratio
print fuzz.partial_ratio("this is a test", "this is a test!") # partial ratio
# token sort ratio
print fuzz.token_sort_ratio("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")

print "Process"
choices = ["Atlanta Falcons", "New York Jets", "New York Giants", "Dallas Cowboys"]
print process.extract("new york jets", choices, limit=2) # find best two choices
print process.extractOne("cowboys", choices) # find best choice


Output:
97
100
100
Process
[('New York Jets', 100), ('New York Giants', 79)]
('Dallas Cowboys', 90)

Python FuzzyWuzzy : string matching


FuzzyWuzzy does fuzzy string matching: it scores how similar two strings are, based on the Levenshtein distance between them. It is very handy for dealing with human-generated data.

Install:
pip install -e git+git://github.com/seatgeek/fuzzywuzzy.git#egg=fuzzywuzzy
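
To make the term concrete: the Levenshtein distance between two strings is the minimum number of single-character insertions, deletions and substitutions needed to turn one into the other. The function below is a plain dynamic-programming sketch for Python 3, written for this post only. It is not FuzzyWuzzy's implementation, and FuzzyWuzzy reports similarity scores from 0 to 100 rather than this raw distance.

```python
def levenshtein(a, b):
    """Return the minimum number of single-character edits turning a into b."""
    # previous[j] is the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
print(levenshtein("this is a test", "this is a test!"))  # 1
```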

Sunday, November 18, 2012

python dateutil : date operation is fun


Here is some example code for working with dates using dateutil. It is very easy and saves a lot of pain and code. I think from now on you will like date operations much more.
All source code available on github

 
from dateutil.parser import parse
from dateutil.relativedelta import *

from datetime import *
print "parse example"
print parse('Mon, 11 Jul 2011 10:01:56 +0200 (CEST)')
s = "Today is 25 of September of 2003, exactly at 10:49:41 with timezone -03:00."
print parse(s, fuzzy=True)


today = datetime.now()
print 20*"=="
print today # today
print today+relativedelta(months=+1) # Next month
print today+relativedelta(years=+1) # Next year
print today+relativedelta(months=+1, weeks=+1) # Next month, plus one week
print today+relativedelta(months=+1, weeks=+1, hour=10) # Next month, plus one week, at 10am
print today+relativedelta(years=+1,months=-1) # One month before one year

print 20*"++"
print today+relativedelta(weekday=FR) # Next friday
print today+relativedelta(days=+1,weekday=SU(+1)) # Next sunday, but not today
print today+relativedelta(day=31, weekday=FR(-1)) # Last friday in this month


print "Calculate Age"
birthday = datetime(1971, 4, 5, 12, 0)
age = relativedelta(today, birthday)  # calculate age
print age
print "years: ",age.years," months: ",age.months," day: ",age.days




Output:
parse example
2011-07-11 10:01:56+02:00
2003-09-25 10:49:41-03:00
========================================
2012-11-18 23:35:31.191000
2012-12-18 23:35:31.191000
2013-11-18 23:35:31.191000
2012-12-25 23:35:31.191000
2012-12-25 10:35:31.191000
2013-10-18 23:35:31.191000
++++++++++++++++++++++++++++++++++++++++
2012-11-23 23:35:31.191000
2012-11-25 23:35:31.191000
2012-11-30 23:35:31.191000
Calculate Age
relativedelta(years=+41, months=+7, days=+13, hours=+11, minutes=+35, seconds=+31, microseconds=+191000)
years:  41  months:  7  day:  13
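
To see what relativedelta(weekday=FR) and relativedelta(day=31, weekday=FR(-1)) save you from writing, here is roughly the same calculation with only the standard library. This is a sketch for Python 3; the helper names `next_weekday` and `last_weekday_of_month` are made up for illustration and are not part of dateutil.

```python
import calendar
from datetime import date, timedelta

FRIDAY = 4  # date.weekday() numbering: Monday is 0, Sunday is 6


def next_weekday(start, weekday):
    """First date on or after `start` that falls on `weekday`."""
    days_ahead = (weekday - start.weekday()) % 7
    return start + timedelta(days=days_ahead)


def last_weekday_of_month(year, month, weekday):
    """Last date in the given month that falls on `weekday`."""
    last_day = date(year, month, calendar.monthrange(year, month)[1])
    days_back = (last_day.weekday() - weekday) % 7
    return last_day - timedelta(days=days_back)


today = date(2012, 11, 18)  # the date used in the output above
print(next_weekday(today, FRIDAY))              # 2012-11-23
print(last_weekday_of_month(2012, 11, FRIDAY))  # 2012-11-30
```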

python dateutil


The Python dateutil module provides powerful extensions to the standard datetime module. It reduces the pain of handling dates. Here is the main tutorial page.

Install:
pip install python-dateutil

Pyquery

Pyquery is very similar to jQuery, and so is its API. It allows us to make queries on XML documents. Pyquery uses lxml for fast XML and HTML manipulation. It is also very useful for scraping. Let's have a look.

Install:
pip install pyquery

Saturday, November 3, 2012

Tornado: POST API example


Tornado post API example code.
All updated source code available on github

 
import markdown
import os.path
import re
import tornado.auth
import tornado.database
import tornado.httpserver
import tornado.ioloop
import tornado.options
import tornado.web
import unicodedata

from tornado.options import define, options

define("port", default=8080, help="run on the given port", type=int)
define("mysql_host", default="127.0.0.1", help="api database host")
define("mysql_database", default="tornado_api", help="tornado_api database name")
define("mysql_user", default="root", help="tornado_api database user")
define("mysql_password", default="", help="tornado_api database password")


class Application(tornado.web.Application):
    def __init__(self):
        project_dir = os.getcwd()
        #project_dir = 'C:/tornado-2.4/demos/blog/'
        handlers = [
            (r"/all_book/", BooksHandler),
            (r"/all_category/", CategoryHandler),
        ]
        settings = dict(
            #autoescape=None,
        )
        tornado.web.Application.__init__(self, handlers, **settings)

        # Have one global connection to the blog DB across all handlers
        self.db = tornado.database.Connection(
            host=options.mysql_host, database=options.mysql_database,
            user=options.mysql_user, password=options.mysql_password)


class BaseHandler(tornado.web.RequestHandler):
    @property
    def db(self):
        return self.application.db


class BooksHandler(BaseHandler):
    def post(self):
        try:
            print "Adding new book"
            name = self.get_argument("name")
            title = self.get_argument("title")
            author = self.get_argument("author")
            if not name or not title or not author:
                return self.write({"success":False})
            if not len(name) or not len(title) or not len(author):
                return self.write({"success":False})
            print "[ NEW BOOK ] name ",name," title ",title," author ",author
            self.db.execute(
                "INSERT INTO book (name,title,author) VALUES (%s,%s,%s)",
                name, title, author)
            self.write({"success":True})
        except:
            self.write({"success":False})

class CategoryHandler(BaseHandler):

    def post(self):
        try:
            print "Adding new category"
            name = self.get_argument("name")
            if not name or not len(name):
                return self.write({"success":False})
            print "[ NEW CATEGORY ] name ",name
            self.db.execute(
                "INSERT INTO category (name) VALUES (%s)",name)
            self.write({"success":True})
        except:
            self.write({"success":False})

def main():
    tornado.options.parse_command_line()
    http_server = tornado.httpserver.HTTPServer(Application())
    http_server.listen(options.port)
    tornado.ioloop.IOLoop.instance().start()


if __name__ == "__main__":
    main()



API call:

+ add new book:
  - url : http://127.0.0.1:8080/all_book/
  - param : name, title, author
+ add new category:
  - url : http://127.0.0.1:8080/all_category/
  - param : name
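
The handlers read their values with get_argument, so a client can send them as an ordinary form-encoded POST body. The sketch below builds such a request with the Python 3 standard library without sending it. The helper name `build_add_book_request` and the sample values are made up for illustration; the host, port and path come from the code above.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://127.0.0.1:8080"  # host and port from the post


def build_add_book_request(name, title, author):
    """Build (but do not send) a form-encoded POST for /all_book/."""
    body = urlencode({"name": name, "title": title, "author": author})
    return Request(BASE_URL + "/all_book/",
                   data=body.encode("utf-8"), method="POST")


# Sample values are placeholders, not data from the original project.
request = build_add_book_request("tornado", "Intro to Tornado", "Someone")
print(request.get_method(), request.full_url)
print(request.data)

# To actually send it (requires the server above to be running):
# from urllib.request import urlopen
# print(urlopen(request).read())
```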

Tornado: Sample REST API (GET)


Sample REST API(GET) using Tornado.
Full updated project(with database) is available on Github

 
import os.path
import re
import tornado.auth
import tornado.database
import tornado.httpserver
import tornado.ioloop
import tornado.options
import tornado.web
import unicodedata

from tornado.options import define, options


define("port", default=8080, help="run on the given port", type=int)
define("mysql_host", default="127.0.0.1", help="api database host")
define("mysql_database", default="tornado_api", help="tornado_api database name")
define("mysql_user", default="user", help="tornado_api database user")
define("mysql_password", default="password", help="tornado_api database password")


class Application(tornado.web.Application):
    def __init__(self):
        project_dir = os.getcwd()
        # map all handlers here
        handlers = [
            (r"/", BooksHandler),
            (r"/all_book", BooksHandler),
            (r"/all_category", CategoryHandler),
            (r"/all", AllHandler)
        ]
        settings = dict(
            #autoescape=None,
        )
        tornado.web.Application.__init__(self, handlers, **settings)

        # Have one global connection to the blog DB across all handlers
        self.db = tornado.database.Connection(
            host=options.mysql_host, database=options.mysql_database,
            user=options.mysql_user, password=options.mysql_password)

# this is base handler
class BaseHandler(tornado.web.RequestHandler):
    @property
    def db(self):
        return self.application.db


# this handler returns all books
class BooksHandler(BaseHandler):
    def get(self):
        entries = self.db.query("SELECT * FROM book WHERE 1")
        result = {}
        result["book"]=[]
        for entrie in entries:
            result["book"].append({"id":entrie.id, "name":entrie.name, "title":entrie.title, "author":entrie.author})
        self.write(result)

# this handler returns all categories
class CategoryHandler(BaseHandler):
    def get(self):
        entries = self.db.query("SELECT * FROM category WHERE 1")
        result = {}
        result["category"]=[]
        for entrie in entries:
            result["category"].append({"id":entrie.id, "name":entrie.name})
        self.write(result)

# this handler returns all books and all categories
class AllHandler(BaseHandler):
    def get(self):
        entries = self.db.query("SELECT * FROM category WHERE 1")
        result = {}
        result["category"]=[]
        for entrie in entries:
            result["category"].append({"id":entrie.id, "name":entrie.name})

        entries = self.db.query("SELECT * FROM book WHERE 1")
        result["book"]=[]
        for entrie in entries:
            result["book"].append({"id":entrie.id, "name":entrie.name, "title":entrie.title, "author":entrie.author})
        self.write(result)

# tornado main function
def main():
    tornado.options.parse_command_line()
    http_server = tornado.httpserver.HTTPServer(Application())
    http_server.listen(options.port)
    tornado.ioloop.IOLoop.instance().start()


if __name__ == "__main__":
    main()
 




API call:

http://127.0.0.1:8080/all_book
http://127.0.0.1:8080/all_category
http://127.0.0.1:8080/all
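
The three GET handlers repeat the same loop that turns database rows into dictionaries. One way to factor that out is a small helper like the one sketched below for Python 3. The name `rows_to_dicts` is hypothetical, and the namedtuple only stands in for the rows that self.db.query(...) returns; the sample values are made up.

```python
from collections import namedtuple


def rows_to_dicts(rows, fields):
    """Turn row objects with attribute access into plain dicts."""
    return [dict((field, getattr(row, field)) for field in fields)
            for row in rows]


# Stand-in for the rows returned by self.db.query(...); values are made up.
Book = namedtuple("Book", ["id", "name", "title", "author"])
rows = [Book(1, "tornado", "Intro to Tornado", "Someone")]

result = {"book": rows_to_dicts(rows, ["id", "name", "title", "author"])}
print(result)
```

Inside a handler the result dictionary would then be passed to self.write(result) exactly as in the code above.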

Tornado: non-blocking web server


Tornado is a non-blocking, scalable web server. It can handle thousands of simultaneous standing connections, which makes it ideal for real-time web services.

Install:
Download the package from http://www.tornadoweb.org/ and install it with Python (python setup.py install).

Tornado ships with many demo projects: helloworld, appengine, auth, blog, chat, facebook, s3server, and so on.

django hello world


Django is a high-level Python web framework. It encourages rapid development and clean, pragmatic design. It lets you build high-performing, elegant web applications quickly. You can also develop a back-end API (Django has some cool libraries for that). Let's go.

Create a django project.

From command prompt:
Create django project: django-admin.py startproject django_hello_world
Create django app: python manage.py startapp hello

I am using pycharm, so I can create my project from pycharm.

Here I show only the files I needed to edit to run the hello world project.
You can download the full project from here

 

// settings.py

# set admin
ADMINS = (
    ('jony', 'jony.cse@gmail.com'),
)

#set database
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.sqlite3', 
        # Add 'postgresql_psycopg2', 'mysql', 'sqlite3' or 'oracle'.
        'NAME': 'dev.db',                      
        # Or path to database file if using sqlite3.
        'USER': '',                      # Not used with sqlite3.
        'PASSWORD': '',                  # Not used with sqlite3.
        'HOST': '',                      
         # Set to empty string for localhost. Not used with sqlite3.
        'PORT': '',                      
        # Set to empty string for default. Not used with sqlite3.
    }
}
INSTALLED_APPS = (
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.sites',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    # Uncomment the next line to enable the admin:
    # 'django.contrib.admin',
    # Uncomment the next line to enable admin documentation:
    # 'django.contrib.admindocs',
    'hello', # need to install your app
)

// views.py

# create a view function which we will call
from django.http import HttpResponse
def hello_view(request):
    return HttpResponse("Hello my django!!")
	
// urls.py

# call view from url
from django.conf.urls import patterns, url
from hello import views
urlpatterns = patterns('',
    url(r'^hello_django/$', views.hello_view, name='my_hello_view'),
)




running project:

commands:
python manage.py syncdb
# run this command whenever you make a change to the db;
# on the first run it also lets you create a super user
python manage.py runserver 8080

Now hit this url:
http://127.0.0.1:8080/hello_django/

django install


Install from pip:
pip install Django

or
download the Django package and install it with the command python setup.py install

Beautiful Soup CSS selector


Beautiful Soup supports a subset of the CSS selector standard. Just construct the selector as a string and pass it into the .select() method of a Tag or the BeautifulSoup object itself.
I used this html file for practice. All source code available on github

 
from pprint import pprint
from bs4 import BeautifulSoup

html_content = open('bs_sample3.html') 
# http://dl.dropbox.com/u/49962071/blog/python/resource/bs_sample3.html
soup = BeautifulSoup(html_content) # making soup

pprint(soup.select("title")) # get title tag
pprint(soup.select("body a")) # all a tags anywhere inside body
pprint(soup.select("html head title")) # html->head->title
pprint(soup.select("head > title")) # title directly under head
pprint(soup.select("p > a")) # all a tags directly under a p
pprint(soup.select("body > a")) # all a tags directly under body
pprint(soup.select(".sister")) # select by class
pprint(soup.select("#link1")) # select by id
pprint(soup.select('a[href="http://example.com/elsie"]'))
# find tags by exact attribute value
pprint(soup.select('a[href^="http://example.com/"]'))
# find tags whose attribute value starts with 'http://example.com/'
pprint(soup.select('p[lang|=en]')) # match language codes

Friday, November 2, 2012

Beautiful Soup find_all() search API


find_all() is the most popular method in the Beautiful Soup search API. It reduces your code size massively. We can pass a regular expression or a custom function to it. I used this html file for practice.
All source code available on github

 
from pprint import pprint
import re
from bs4 import BeautifulSoup

html_content = open('bs_sample.html') 
#http://dl.dropbox.com/u/49962071/blog/python/resource/bs_sample.html
soup = BeautifulSoup(html_content) # making soup

for tag in soup.find_all(re.compile("^p")): # find all tags whose name starts with p
    print tag.name

for tag in soup.find_all(re.compile("t")): # find all tags whose name contains t
    print tag.name

for tag in soup.find_all(True): # find all tags
    print tag.name

pprint(soup.find_all('a')) # find all a tag
print 20*"++"
pprint(soup.find_all(["a", "b"])) # find multiple tag


def has_class_but_no_id(tag):
    return tag.has_attr('class') and not tag.has_attr('id')

pprint(soup.find_all(has_class_but_no_id)) 
# pass a function to find_all

pprint(soup.find_all(text=re.compile("sisters")))
# find all strings that contain 'sisters'
print 20*"++"
pprint(soup.find_all(href=re.compile("my_url"))) # all tags whose href contains "my_url"
pprint(soup.find_all(id=True)) # all tags that have an id
pprint(soup.find_all(class_=True)) # all tags that have a class

def has_seven_characters(css_class):
    return css_class is not None and len(css_class) == 7

pprint(soup.find_all(class_=has_seven_characters))
# find all tags whose class name is exactly 7 characters long

pprint(soup.find_all("a", "sister")) # find all a tags with class 'sister'
pprint(soup.find_all("a", re.compile("sister")))
# find all a tags whose class name contains 'sister'
print 20*"++"

pprint(soup.find_all(href=re.compile("elsie"), id='link1'))
# href contains elsie and id = link1
pprint(soup.find_all(attrs={'href' : re.compile("elsie"), 'id': 'link1'}))
# same search, written with an attrs dictionary

pprint(soup.find_all("a", limit=2)) # use limit on find_all

pprint(soup.html.find_all("title", recursive=True))
# recursive=True is the default; pass recursive=False to search direct children only


explore git


My git notes. I practiced git on github.

 
Git is awesome. I use git in many of my projects. Git has many useful commands, and for me
it is hard to remember them all. So I made my own git notes; you can make yours.

Beautiful Soup 4 exploring


A quick exploration of Beautiful Soup 4. I used this document for practice.
All source code of this blog is available on github.

 
from pprint import pprint
from bs4 import BeautifulSoup

html_content = open('bs_sample.html') 
# http://dl.dropbox.com/u/49962071/blog/python/resource/bs_sample.html

soup = BeautifulSoup(html_content) # making soup

print soup.prettify() # pretty-print the html, also closing any unclosed tags

print soup.title # page title tag
print soup.title.name # page title name
print soup.title.parent.name # page title parent
print soup.p # first p tag
print soup.p.string # string content of first p tag
print soup.p['class'] # first p tag class name
print soup.a  # first a tag
pprint( soup.find_all('a'))  # all a tag
pprint( soup.find_all('p'))  # all p tag
print soup.find(id='link3') # find tag with id = link3
print 'All links:'
for link in soup.find_all('a'):
    print link.get('href') # get url

print soup.get_text() # return text part of html_document