Planet Python
Last update: March 09, 2011 03:45 AM
March 08, 2011
Will Kahn-Greene
Python Software Foundation Grant for Python Miro Community
A couple of weeks ago at Carl's urging, I applied for a grant from the Python Software Foundation. This would cover Miro Community service costs for the next year as well as work on a series of improvements to the site. Things like:
- Universal Subtitles support
- using transcriptions in the search corpus for videos
- implementing an API in Miro Community allowing for automated data validation
I talked about all this at length in my call for funding.
I'm very pleased to announce that the PSF has awarded me a grant. I know how selective they are in their grant approval and I really appreciate this. It helps me a ton and I will work hard to make it money well spent.
I'll be at PyCon 2011. I hope to spend some time with Carl, Asheesh and others working on Miro Community. I'm also hoping to talk with people who've used the site about what kinds of things we can make better going forward. If you see me, feel free to say, "Hi!"
Roberto Alsina
OK, so THAT is how much browser I can put in 128 lines of code.
I have already posted a couple of times (1, 2) about De Vicenzo, an attempt to implement the rest of the browser, starting with PyQt's WebKit... limiting myself to 128 lines of code.
Of course I could do more, but I have my standards!
- No using ;
- No if whatever: f()
Other than that, I did a lot of dirty tricks, but right now, it's a fairly complete browser, and it has 127 lines of code (according to sloccount) so that's enough playing and it's time to go back to real work.
But first, let's consider how some features were implemented (I'll wrap the lines so the page stays reasonably narrow), and also look at the "normal" versions of the same (the "normal" code is not tested, please tell me if it's broken ;-).
This is not something you should learn how to do. In fact, this is almost a treatise on how not to do things. This is some of the least pythonic, least clear code you will see this week.
It is short, and it is expressive. But it is ugly.
I'll discuss this version.
Proxy Support
A browser is not much of a browser if you can't use it from behind a proxy, but luckily Qt's network stack has good proxy support. The trick was configuring it.
De Vicenzo supports HTTP and SOCKS proxies by parsing a http_proxy environment variable and setting Qt's application-wide proxy:
proxy_url = QtCore.QUrl(os.environ.get('http_proxy', ''))
QtNetwork.QNetworkProxy.setApplicationProxy(QtNetwork.QNetworkProxy(\
    QtNetwork.QNetworkProxy.HttpProxy if unicode(proxy_url.scheme()).startswith('http')\
    else QtNetwork.QNetworkProxy.Socks5Proxy, proxy_url.host(),\
    proxy_url.port(), proxy_url.userName(), proxy_url.password())) if\
    'http_proxy' in os.environ else None
How would that look in normal code?
if 'http_proxy' in os.environ:
    proxy_url = QtCore.QUrl(os.environ['http_proxy'])
    if unicode(proxy_url.scheme()).startswith('http'):
        protocol = QtNetwork.QNetworkProxy.HttpProxy
    else:
        protocol = QtNetwork.QNetworkProxy.Socks5Proxy
    QtNetwork.QNetworkProxy.setApplicationProxy(
        QtNetwork.QNetworkProxy(
            protocol,
            proxy_url.host(),
            proxy_url.port(),
            proxy_url.userName(),
            proxy_url.password()))
As you can see, the main abuses against python here are the use of the ternary operator as a one-line if (and nesting it), and line length.
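For illustration, here is a tiny self-contained sketch (with made-up names, no Qt) of a nested conditional expression next to the plain if/else it compresses:

```python
# One-liner style, as in De Vicenzo: the conditional expression
# picks the proxy protocol inline.
scheme = "socks5"
protocol = "HttpProxy" if scheme.startswith("http") else "Socks5Proxy"

# The conventional equivalent, spelled out over several lines:
if scheme.startswith("http"):
    protocol2 = "HttpProxy"
else:
    protocol2 = "Socks5Proxy"
```

Both assign the same value; the second form just reads better once the condition or branches grow.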
Persistent Cookies
You really need this, since you want to stay logged into your sites between sessions. For this, first I needed to write some persistence mechanism, and then save/restore the cookies there.
Here's how the persistence is done (settings is a global QSettings instance):
def put(self, key, value):
    "Persist an object somewhere under a given key"
    settings.setValue(key, json.dumps(value))
    settings.sync()

def get(self, key, default=None):
    "Get the object stored under 'key' in persistent storage, or the default value"
    v = settings.value(key)
    return json.loads(unicode(v.toString())) if v.isValid() else default
It's not terribly weird code, except for the use of the ternary operator in the last line. The use of json ensures that as long as reasonable things are persisted, you will get them with the same type as you put them without needing to convert them or call special methods.
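That round-trip behavior can be sketched without Qt at all; here a plain dict stands in for the QSettings instance (all names are made up for the example):

```python
import json

# The dict plays the role of persistent storage (QSettings in the browser).
_store = {}

def put(key, value):
    # Serialize to JSON text, the way the browser's put() does
    _store[key] = json.dumps(value)

def get(key, default=None):
    # Deserialize, so values come back with their original Python types
    raw = _store.get(key)
    return json.loads(raw) if raw is not None else default

put("tabs", ["https://example.org", "https://example.net"])
put("zoom", 1.25)
```

Lists stay lists and floats stay floats on the way back out, with no manual conversion.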
So, how do you save/restore the cookies? First, you need to access the cookie jar. I couldn't find whether there is a global one, or a per-webview one, so I created a QNetworkCookieJar in line 24 and assign it to each web page in line 107.
# Save the cookies, in the window's closeEvent
self.put("cookiejar",
    [str(c.toRawForm()) for c in self.cookies.allCookies()])

# Restore the cookies, in the window's __init__
self.cookies.setAllCookies([QtNetwork.QNetworkCookie.parseCookies(c)[0]\
    for c in self.get("cookiejar", [])])
Here I confess I am guilty of using list comprehensions when a for loop would have been the correct thing.
I use the same trick when restoring the open tabs, with the added misfeature of using a list comprehension and throwing away the result:
# get("tabs") is a list of URLs
[self.addTab(QtCore.QUrl(u)) for u in self.get("tabs", [])]
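For comparison, a minimal sketch (stand-in names, no Qt) of the side-effect comprehension next to the plain loop it should have been:

```python
def add_tab(url, tabs):
    # Stand-in for the browser's addTab(): just record the URL
    tabs.append(url)

urls = ["https://example.org", "https://example.net"]

# Golfed version: the comprehension's result list is built and thrown away
golfed = []
[add_tab(u, golfed) for u in urls]

# Idiomatic version: a plain for loop, no wasted list
plain = []
for u in urls:
    add_tab(u, plain)
```

Both do the same work; the loop just doesn't allocate a list of Nones nobody reads.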
Using Properties and Signals in Object Creation
This is a feature of recent PyQt versions: if you pass property names as keyword arguments when you create an object, they are assigned the value. If you pass a signal as a keyword argument, they are connected to the given value.
This is a really great feature that helps you create clear, local code, and it's a great thing to have. But if you are writing evil code... well, you can go to hell in a handbasket using it.
This is all over the place in De Vicenzo, and here's one example (yes, this is one line):
QtWebKit.QWebView.__init__(self, loadProgress=lambda v:\
    (self.pbar.show(), self.pbar.setValue(v)) if self.amCurrent() else\
    None, loadFinished=self.pbar.hide, loadStarted=lambda:\
    self.pbar.show() if self.amCurrent() else None, titleChanged=lambda\
    t: container.tabs.setTabText(container.tabs.indexOf(self), t) or\
    (container.setWindowTitle(t) if self.amCurrent() else None))
Oh, boy, where do I start with this one.
There are lambda expressions used to define the callbacks in-place instead of just connecting to a real function or method.
There are lambdas that contain the ternary operator:
loadStarted=lambda:\
    self.pbar.show() if self.amCurrent() else None
There are lambdas that use or or a tuple to trick python into doing two things in a single lambda!
loadProgress=lambda v:\
    (self.pbar.show(), self.pbar.setValue(v)) if self.amCurrent() else\
    None
I won't even try to untangle this for educational purposes, but let's just say that line contains what should be replaced by 3 methods, and should be spread over 6 lines or more.
Download Manager
Ok, calling it a manager is overreaching, since you can't stop them once they start, but hey, it lets you download things and keep on browsing, and reports the progress!
First, on line 16 I created a bars dictionary for general bookkeeping of the downloads.
Then, I needed to delegate the unsupported content to the right method, and that's done in lines 108 and 109.
What that does is basically this: whenever you click on something WebKit can't handle, the method fetch will be called and passed the network request.
def fetch(self, reply):
    destination = QtGui.QFileDialog.getSaveFileName(self, \
        "Save File", os.path.expanduser(os.path.join('~',\
        unicode(reply.url().path()).split('/')[-1])))
    if destination:
        bar = QtGui.QProgressBar(format='%p% - ' +
            os.path.basename(unicode(destination)))
        self.statusBar().addPermanentWidget(bar)
        reply.downloadProgress.connect(self.progress)
        reply.finished.connect(self.finished)
        self.bars[unicode(reply.url().toString())] = [bar, reply,\
            unicode(destination)]
No real code golfing here, except for long lines, but once you break them reasonably, this is pretty much the obvious way to do it:
- Ask for a filename
- Create a progressbar, put it in the statusbar, and connect it to the download's progress signals.
Then, of course, we need the progress slot, which updates the progress bar:
progress = lambda self, received, total:\
    self.bars[unicode(self.sender().url().toString())][0]\
        .setValue(100. * received / total)
Yes, I defined a method as a lambda to save 1 line. [facepalm]
And the finished slot for when the download is done:
def finished(self):
    reply = self.sender()
    url = unicode(reply.url().toString())
    bar, _, fname = self.bars[url]
    redirURL = unicode(reply.attribute(QtNetwork.QNetworkRequest.\
        RedirectionTargetAttribute).toString())
    del self.bars[url]
    bar.deleteLater()
    if redirURL and redirURL != url:
        return self.fetch(redirURL, fname)
    with open(fname, 'wb') as f:
        f.write(str(reply.readAll()))
Notice that it even handles redirections sanely! Beyond that, it just hides the progress bar, saves the data, end of story. The longest line is not even my fault!
There is a big inefficiency in that the whole file is kept in memory until the end. If you download a DVD image, that's gonna sting.
Also, using with saves a line and doesn't leak a file handle, compared to the alternatives.
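The buffering issue could be avoided by streaming to disk in fixed-size chunks; here is a Qt-free sketch of the idea (the function and names are illustrative, not De Vicenzo's code):

```python
import io

def save_stream(src, dst, chunk_size=64 * 1024):
    # Copy src to dst one chunk at a time, so at most one chunk
    # is held in memory instead of the whole download.
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)
        total += len(chunk)
    return total

# Simulate a 200 kB download with in-memory file objects
src = io.BytesIO(b"x" * 200000)
dst = io.BytesIO()
written = save_stream(src, dst)
```

With Qt one would hook this to the reply's readyRead signal rather than buffering until finished, but the chunking principle is the same.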
Printing
Again Qt saved me, because doing this manually would have been a pain. However, it turns out that printing is just... there? Qt, especially when used via PyQt, is such an awesomely rich environment.
self.previewer = QtGui.QPrintPreviewDialog(\
    paintRequested=self.print_)
self.do_print = QtGui.QShortcut("Ctrl+p",\
    self, activated=self.previewer.exec_)
There's not even any need to golf here, that's exactly as much code as you need to hook Ctrl+p to make a QWebView print.
Other Tricks
There are no other tricks. All that's left is creating widgets, connecting things to one another, and enjoying the awesome experience of programming PyQt, where you can write a whole web browser (except the engine) in 127 lines of code.
Imaginary Landscape
Security for Mobile Applications
Imaginary Landscape has been putting significant effort into developing usable and secure mobile websites. Understanding the context of a mobile user is the first step in developing security protocols to protect mobile access and information.
A new blog posting on the Imaginary Landscape main website describes how we approach mobile ...
Matt Harrison
PyCon, a new job and throwing out 70% of your servers
PyCon is coming up. I'm looking forward to some warmer climes after shoveling 8 inches of heavy snow earlier today. I'll be teaching 2 tutorials, Beginner and Intermediate Hands-on Python, and will post materials for those soon.
Earlier this year, I began working for a purveyor of fast storage devices, Fusion-IO. We make enterprise storage that makes up for Moore's Law not really working for spinning disk. Enough with the marketing talk, here's a post by Jeremy Zawodny, on how Craigslist used Fusion-IO devices to go from 14 overloaded servers to 4 underutilized servers. Needless to say, these things are selling like hotcakes and we use a lot of Python. BTW, we are hiring and there will be a few other guys from Fusion-IO at PyCon. Feel free to inquire.
In other news, I've joined the 21st century and am slowly ramping up on using my twitter account, "dunder mharrison".
Grig Gheorghiu
Monitoring is for ops what testing is for dev
Devops. It's the new buzzword. Go to any tech conference these days and you're sure to find an expert panel on the 'what' and 'why' of devops. These panels tend to be light on the 'how', because that's where the rubber meets the road. I tried to give a step-by-step description of how you can become a Ninja Rockstar Internet Samurai devops in my blog post on 'How to whip your infrastructure into shape'.
Here I just want to say that I am struck by the parallels that exist between the activities of developer testing and operations monitoring. It's not a new idea by any means, but it's been growing on me recently.
Test-infected vs. monitoring-infected
Good developers are test-infected. It doesn't matter too much whether they write tests before or after writing their code -- what matters is that they do write those tests as soon as possible, and that they don't consider their code 'done' until it has a comprehensive suite of tests. And of course test-infected developers are addicted to watching those dots in the output of their favorite test runner.
Good ops engineers are monitoring-infected. They don't consider their infrastructure build-out 'done' until it has a comprehensive suite of monitoring checks, notifications and alerting rules, and also one or more dashboard-type systems that help them visualize the status of the resources in the infrastructure.
Adding tests vs. adding monitoring checks
Whenever a bug is found, a good developer will add a unit test for it. It serves as a proof that the bug is now fixed, and also as a regression test for that bug.
Whenever something unexpectedly breaks within the systems infrastructure, a good ops engineer will add a monitoring check for it, and if possible a graph showing metrics related to the resource that broke. This ensures that alerts will go out in a timely manner next time things break, and that correlations can be made by looking at the metrics graphs for the various resources involved.
Ignoring broken tests vs. ignoring monitoring alerts
When a test starts failing, you can either fix it so that the bar goes green, or you can ignore it. Similarly, if a monitoring alert goes off, you can either fix the underlying issue, or you can ignore it by telling yourself it's not really critical.
The problem with ignoring broken tests and monitoring alerts is that this attitude leads slowly but surely to the Broken Window Syndrome. You train yourself to ignore issues that sooner or later will become critical (it's a matter of when, not if).
A good developer will make sure there are no broken tests in their Continuous Integration system, and a good ops engineer will make sure all alerts are accounted for and the underlying issues fixed.
Improving test coverage vs. improving monitoring coverage
Although 100% test coverage is not sufficient for your code to be bug-free, still, having something around 80-90% code coverage is a good measure that you as a developer are disciplined in writing those tests. This makes you sleep better at night and gives you pride in producing quality code.
For ops engineers, sleeping better at night is definitely directly proportional to the quantity and quality of the monitors that are in place for their infrastructure. The more monitors, the better the chances that issues are caught early and fixed before they escalate into the dreaded 2 AM pager alert.
Measure and graph everything
The more dashboards you have as a devops, the better insight you have into how your infrastructure behaves, from both a code and an operational point of view. I am inspired in this area by the work that's done at Etsy, where they are graphing every interesting metric they can think of (see their 'Measure Anything, Measure Everything' blog post).
As a developer, you want to see your code coverage graphs showing decent values, close to that mythical 100%. As an ops engineer, you want to see uptime graphs that are close to the mythical 5 9's.
But maybe even more importantly, you want insight into metrics that tie directly into your business. At Evite, processing messages and sending email reliably is our bread and butter, so we track those processes closely and we have dashboards for metrics related to them. Spikes, either up or down, are investigated quickly.
Here are some examples of the dashboards we have. For now these use homegrown data collection tools and the Google Visualization API, but we're looking into using Graphite soon.
Outgoing email messages in the last hour (spiking at close to 100 messages/second):
Percentage of errors across some of our servers:
Associated with these metrics we have Nagios alerts that fire when certain thresholds are being met. This combination allows our devops team to sleep better at night.
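As an illustration of the alerting side, here is a minimal sketch of a Nagios-style threshold check (the plugin convention maps exit codes 0/1/2 to OK/WARNING/CRITICAL; the metric and thresholds are made up):

```python
def check_metric(value, warn, crit):
    # Nagios plugin convention: return code 0 = OK, 1 = WARNING, 2 = CRITICAL,
    # plus a one-line status message for the notification.
    if value >= crit:
        return 2, "CRITICAL - value=%s" % value
    if value >= warn:
        return 1, "WARNING - value=%s" % value
    return 0, "OK - value=%s" % value

# e.g. error-rate percentage with warning at 50 and critical at 90
status, message = check_metric(60, warn=50, crit=90)
```

A real plugin would read the metric from the system and sys.exit() with the code, but the threshold logic is the whole idea.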
Python User Groups
pyCologne Python User Group Cologne - Meeting, March 9, 2011, 6.30pm
The next meeting of pyCologne will take place:
Wednesday, March 9th, starting about 6.30 pm - 6.45 pm
at Room 0.14, Benutzerrechenzentrum (RRZK-B)
University of Cologne, Berrenrather Str. 136, 50937 Köln, Germany
Any presentations, news, book reviews etc. are welcome at each of our meetings!
At about 8.30 pm we will as usual enjoy the rest of the evening in a nearby restaurant.
Further information, including directions on how to get to the location, can be found at:
https://www.pycologne.de
(Sorry, the web-links are in German only.)
Mikko Ohtamaa
Installing and using Scrapy web crawler to search text on multiple sites
Here is a little script to use Scrapy, a web crawling framework for Python, to search sites for references for certain texts including link content and PDFs. This is handy for cases where you need to find links violating the user policy, trademarks which are not allowed or just to see where your template output is being used. Our Scrapy example differs from a normal search engine as it does HTML source code level checking: you can also search for CSS classes, link targets and other elements which may be invisible for normal search engines.
Scrapy comes with a command-line tool and project skeleton generator. You need to generate your own Scrapy project to where you can then add your own spider classes.
Install Scrapy using Distribute (or setuptools):
easy_install Scrapy
Create project code skeleton:
scrapy startproject myscraper
Add your spider class skeleton by creating a file myscraper/spiders/spiders.py:
from scrapy.contrib.spiders import CrawlSpider, Rule
from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor

class MySpider(CrawlSpider):
    """ Crawl through web sites you specify """

    name = "mycrawler"

    # Stay within these domains when crawling
    allowed_domains = ["www.mysite.com"]

    start_urls = [
        "https://www.mysite.com/",
    ]

    # Add our callback which will be called for every found link
    rules = [
        Rule(SgmlLinkExtractor(), follow=True)
    ]
Start Scrapy to test that it's crawling properly. Run the following in the top-level directory:
scrapy crawl mycrawler
You should see output like:
2011-03-08 15:25:52+0200 [scrapy] INFO: Scrapy 0.12.0.2538 started (bot: myscraper)
2011-03-08 15:25:52+0200 [scrapy] DEBUG: Enabled extensions: TelnetConsole, SpiderContext, WebService, CoreStats, MemoryUsage, CloseSpider
2011-03-08 15:25:52+0200 [scrapy] DEBUG: Enabled scheduler middlewares: DuplicatesFilterMiddleware
You can hit CTRL+C to interrupt scrapy.
Then let’s enhance the spider a bit to search for blacklisted tags, with optional whitelisting, in myscraper/spiders/spiders.py. We also use the pyPdf library to crawl inside PDF files:
"""
A sample crawler for seeking a text on sites.
"""

import StringIO
from functools import partial

from scrapy.http import Request
from scrapy.spider import BaseSpider
from scrapy.contrib.spiders import CrawlSpider, Rule
from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor
from scrapy.item import Item


def find_all_substrings(string, sub):
    """
    https://code.activestate.com/recipes/499314-find-all-indices-of-a-substring-in-a-given-string/
    """
    import re
    starts = [match.start() for match in re.finditer(re.escape(sub), string)]
    return starts


class MySpider(CrawlSpider):
    """ Crawl through web sites you specify """

    name = "mycrawler"

    # Stay within these domains when crawling
    allowed_domains = ["www.mysite.com", "www.mysite2.com", "intranet.mysite.com"]

    start_urls = [
        "https://www.mysite.com/",
        "https://www.mysite2.com/",
        "https://intranet.mysite.com/"
    ]

    # Add our callback which will be called for every found link
    rules = [
        Rule(SgmlLinkExtractor(), follow=True, callback="check_violations")
    ]

    # How many pages crawled? XXX: Was not sure if CrawlSpider is a singleton class
    crawl_count = 0

    # How many text matches we have found
    violations = 0

    def get_pdf_text(self, response):
        """ Peek inside PDF to check possible violations.

        @return: PDF content as searchable plain-text string
        """
        try:
            from pyPdf import PdfFileReader
        except ImportError:
            print "Needed: easy_install pyPdf"
            raise

        stream = StringIO.StringIO(response.body)
        reader = PdfFileReader(stream)

        text = u""
        if reader.getDocumentInfo().title:
            # Title is optional, may be None
            text += reader.getDocumentInfo().title

        for page in reader.pages:
            # XXX: Does this handle unicode properly?
            text += page.extractText()

        return text

    def check_violations(self, response):
        """ Check a server response page (file) for possible violations """

        # Do some user visible status reporting
        self.__class__.crawl_count += 1
        crawl_count = self.__class__.crawl_count
        if crawl_count % 100 == 0:
            # Print some progress output
            print "Crawled %d pages" % crawl_count

        # Entries which are not allowed to appear in content.
        # These are case-sensitive
        blacklist = ["meat", "ham"]

        # Entries which are allowed to appear. They are usually
        # non-human visible data, like CSS classes, and may not be interesting business wise
        exceptions_after = [
            "meatball",
            "hamming",
            "hamburg"
        ]

        # These are preceding strings where our match is allowed
        exceptions_before = [
            "bushmeat",
            "honeybaked ham"
        ]

        url = response.url

        # Check response content type to identify what kind of payload this link target is
        ct = response.headers.get("content-type", "").lower()
        if "pdf" in ct:
            # Assume a PDF file
            data = self.get_pdf_text(response)
        else:
            # Assume it's HTML
            data = response.body

        # Go through our search goals to identify any "bad" text on the page
        for tag in blacklist:
            substrings = find_all_substrings(data, tag)
            # Check entries against the exception list for "allowed" special cases
            for pos in substrings:
                ok = False
                for exception in exceptions_after:
                    sample = data[pos:pos + len(exception)]
                    if sample == exception:
                        #print "Was whitelisted special case:" + sample
                        ok = True
                        break
                for exception in exceptions_before:
                    sample = data[pos - len(exception) + len(tag):pos + len(tag)]
                    #print "For %s got sample %s" % (exception, sample)
                    if sample == exception:
                        #print "Was whitelisted special case:" + sample
                        ok = True
                        break
                if not ok:
                    self.__class__.violations += 1
                    print "Violation number %d" % self.__class__.violations
                    print "URL %s" % url
                    print "Violating text:" + tag
                    print "Position:" + str(pos)
                    piece = data[pos - 40:pos + 40].encode("utf-8")
                    print "Sample text around position:" + piece.replace("\n", " ")
                    print "------"

        # We are not actually storing any data, return dummy item
        return Item()

    def _requests_to_follow(self, response):
        if getattr(response, "encoding", None) != None:
            # Server does not set encoding for binary files.
            # Do not try to follow links in
            # binary data, as this will break Scrapy
            return CrawlSpider._requests_to_follow(self, response)
        else:
            return []
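The whitelist slicing is the trickiest part of the spider, so here is a standalone sketch of just that logic with made-up sample data (no Scrapy needed):

```python
import re

def find_all_substrings(string, sub):
    # Same recipe the spider uses: all start indices of sub in string
    return [m.start() for m in re.finditer(re.escape(sub), string)]

data = "we sell bushmeat and meatballs but plain meat too"
tag = "meat"
exceptions_after = ["meatball"]
exceptions_before = ["bushmeat"]

hits = []
for pos in find_all_substrings(data, tag):
    ok = False
    for exc in exceptions_after:
        # The exception extends the match to the right
        if data[pos:pos + len(exc)] == exc:
            ok = True
    for exc in exceptions_before:
        # Slice backwards so the match sits at the end of the exception
        if data[pos - len(exc) + len(tag):pos + len(tag)] == exc:
            ok = True
    if not ok:
        hits.append(pos)
```

"bushmeat" and "meatballs" are whitelisted; only the bare "meat" (at position 41 in this sample) is flagged.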
Let’s tune down the logging output level, so we get only relevant data in the output. In myscraper/settings.py add:
LOG_LEVEL="INFO"
Now you can run the crawler and pipe the output to a text file:
scrapy crawl mycrawler > violations.txt
More information
Read our blog
Subscribe to the mFabrik blog in a reader
Follow me on Twitter
Wingware
Wingware at PyCon 2011
Wingware will be at PyCon 2011 Friday through Monday this coming weekend (March 11th-14th). For those attending the conference: Please stop by to see us and pick up some Wingware swag at booth 321 in the Expo Hall on Friday or Saturday. We will also be participating in the Python IDE Panel on Saturday at 11:45AM in Centennial I and are planning two open spaces where we can provide demos, answer questions, or show the new features in Wing IDE 4.0. Hope to see you there!
Eli Bendersky
Non-constant global initialization in C and C++
Consider this code:
int init_func()
{
return 42;
}
int global_var = init_func();
int main()
{
return global_var;
}
Is it valid C? Is it valid C++?
Curiously, the answer to the first question is no, and to the second question is yes. This can be easily checked with a compiler:
$ gcc -Wall -pedantic global_init.c
global_init.c:7: error: initializer element is not constant
$ g++ -Wall -pedantic global_init.c
$ a.out; echo $?
42
The C standard prohibits initialization of global objects with non-constant values. Section 6.7.8 of the C99 standard states:
All the expressions in an initializer for an object that has static storage duration shall be constant expressions or string literals.
What is an object with static storage duration? This is defined in section 6.2.4:
An object whose identifier is declared with external or internal linkage, or with the storage-class specifier static has static storage duration. Its lifetime is the entire execution of the program and its stored value is initialized only once, prior to program startup.
C++ is a different story, however. In C++ much more is being determined at runtime before the user’s main function runs. This is in order to allow proper construction of global and static objects (C++ objects may have user-defined constructors, which isn’t true for C).
Peeking at the disassembled code produced by g++ for our code snippet, we see some interesting symbols, among them __do_global_ctors_aux and _Z41__static_initialization_and_destruction_0ii, both executed before our main.
In particular, _Z41__static_initialization_and_destruction_0ii does the actual initialization of global_var. Here are the relevant lines:
40055d: callq 400528 <_Z9init_funcv>
400562: mov %eax,2098308(%rip) # 6009ec <global_var>
init_func is called (its name is distorted due to C++ name mangling), and then its return value (which is in eax) is assigned to global_var.
Related posts:
- Initialization of structures and arrays in C++ Suppose you have a large array or structure containing important...
- Variable initialization in C++ There are many ways to initialize a variable in C++....
- Array initialization with enum indices in C but not C++ Suppose you have the following scenario: some function you’re writing...
March 07, 2011
PyCon US
PyCon 2011: Program Guide on iOS and Android Devices
To install, follow this link: Conventionist - Get It! or search the App Store or Android Market for 'conventionist' from Proxima Labs. Once the application is installed, run it and select 'Download Guides'. Look for and select the "PyCon US '11" guide.
The entire schedule, including tutorials, with detailed information is available, as well as information on all our sponsors and exhibitors. Maps of the conference area, exhibitors room, and poster session are included. You can create a personal schedule with reminders natively; this is not connected to the personal schedule feature on our website.
Very special thanks to Jeff Lewis, Peter Lada, and the entire Proxima Labs team for providing such a fantastic service!
Mike C. Fletcher
PyPy hits 3x speed (or 1/12th, or 2.5x depending on the sign-post)
Another day playing with PyPy. First up was a pleasant surprise in that the 2x slowdown was reduced with the current nightly build[1], bringing performance up from 40,000cps to ~60,000cps.
After that, applied a micro-optimization that got me about a 6% speedup; I eliminated the use of a state "struct object" (object that just used regular attribute access and no methods) in favour of passing all of the state as explicit arguments and returning the state modification(s) to the caller. Not a huge win, but what it did do is make it possible for cfbolz to point out that, with the modifications, there was an unneeded re-raising of an exception in one of the most heavily used methods.
Eliminating just that trivial operation caused performance to triple (from ~60,000cps to a somewhat respectable 180,000cps). Baseline for the naive rewrite in cPython was 82,000cps, so we're suddenly seeing a real performance improvement.
Second fix cfbolz proposed was a bit more... evil... basically replace a for-loop with a recursive call, which caused performance to jump to 265,000cps on pypy, but caused it to drop noticeably on cpython. A little bit of code to test for pypy before using the pypy hack eliminates that issue.
End of the day, the code-base is markedly faster on pypy, but has also been optimized (slightly) for cpython as well. cpython parses at 110,000cps, pypy at 265,000cps on the test file. That puts us at ~1/12th the speed of the optimized C for pypy and ~1/30th for cpython.
There are still lots of things I want to explore, but I don't have the time to work on them today, so I suppose that will be next week...
[1] at the time I'd thought it had been entirely reduced, but that was because I ran the cPython test in the wrong window (duh!)
Greg Turnquist
Runnable code fragments are important
As I work through the rewrites for chapter 3, I am really thankful that I focused on making every block of code runnable when I first wrote these chapters. Maybe that sounds strange, but it isn't that uncommon to be writing a recipe and want to go back and add a step that was missed. You write some extra code in the draft and then move on.
Sometimes it can be very tempting to just put that extra bit in the draft, especially when working towards a deadline. But I paid extra special attention to capturing changes in code, even starting a new file if it was an alteration to code already run. So now I'm just having to include the steps to creating the files which I forgot to include the first time around. This has given me extra confidence in the quality of the code, and I haven't received any comments about bugs yet. Yeah!
This makes me feel good that my readers just have to copy down the code and they will be able to run things just like me. In a software book, being able to run the code is vital!
Calvin Spealman
How To Attend Pycon 2011
Short announcement:
PyCon US
PyCon 2011: Live on Startup Row
It is worth quoting just a little from the original post introducing Startup Row: "Since the beginning, Python has always been strongly associated with startups and entrepreneurs.... For Startup Row, we wanted to look toward the future - companies that are just starting today, but may become household names in the future." The founders of these companies will be at PyCon for the main conference days, and for one day they will be participating in the Expo Hall. The other days they will be participating at PyCon with everyone else, so look around - the person next to you may have just started a company. So without further ado, here are the fifteen Startup Row Finalists:
Friday
Saturday
Python Software Foundation
Call for submissions for promotional brochure
A new PSF project aims to create professional quality promotional material about Python. The first goal is to create a brochure to showcase the many ways Python is used. It will include use cases to highlight the ways the language allows users to accomplish their tasks both in educational and in professional settings.
Project team members Marc-André Lemburg, Jan Ulrich Hasecke, and Armin Stross-Radschinski created this Plone marketing brochure for the German Zope User Group. It is the inspiration for this new project.
Community feedback and awareness is vitally important for the success of this initiative, mainly to gather information to be used in the brochure. We are especially looking for interesting projects that can be discussed as use-cases.
If you have any suggestions for information to include in the brochure, please contact Marc-André Lemburg or send an email to brochure AT getpython DOT info.
Eli Bendersky
From C to AST and back to C with pycparser
Ever since I first released pycparser, people were asking me if it’s possible to generate C code back from the ASTs it creates. My answer was always – "sure, it was done by other users and doesn’t sound very difficult".
But recently I thought, why not add an example to pycparser’s distribution showing how one could go about it. So this is exactly what I did, and such an example (examples/c-to-c.py) is part of pycparser version 2.03 which was released today.
Dumping C back from pycparser ASTs turned out to be not too difficult, but not as trivial as I initially imagined. Some particular points of interest I ran into:
- I couldn’t use the generic node visitor distributed with pycparser, because I needed to accumulate generated strings from a node’s children.
- C types were, as usual, a problem. This led to an interesting application of non-trivial recursive AST visiting. To properly print out types, I had to accumulate pointer, array and function modifiers (see the _generate_type method for more details) while traversing down the tree, using this information in the innermost nodes.
- C statements are also problematic, because some expressions can be both parts of other expressions and statements in their own right. This makes it a bit tricky to decide when to add semicolons after expressions.
- ASTs encode operator precedence implicitly (i.e. there's no need to store it explicitly). But how do I print it back into C? Just parenthesizing both sides of each operator quickly gets ugly. So the code uses some heuristics to not parenthesize some nodes that surely have precedence higher than all binary operators. a = b + (c * k) definitely looks better than a = (b) + ((c) * (k)), though both would parse back into the same AST. This applies not only to operators but also to things like structure references. *foo->bar and (*foo)->bar mean different things to a C compiler, and c-to-c.py knows to parenthesize the left side only when necessary.
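The accumulate-from-children visiting and the precedence heuristic can be sketched on a toy expression AST. This is only an illustration of the technique, not pycparser's actual code; all the class and function names here are invented:

```python
# toy AST: each node produces its string by accumulating its children's strings
class Const:
    def __init__(self, value):
        self.value = value

class BinOp:
    PRECEDENCE = {'*': 2, '/': 2, '+': 1, '-': 1}
    def __init__(self, op, left, right):
        self.op, self.left, self.right = op, left, right

def generate(node):
    if isinstance(node, Const):
        return str(node.value)
    # node is a BinOp: visit both children, accumulating their strings
    left = _maybe_paren(node.left, node.op)
    right = _maybe_paren(node.right, node.op)
    return '%s %s %s' % (left, node.op, right)

def _maybe_paren(child, parent_op):
    # heuristic: parenthesise a child only when it binds more loosely
    # than its parent; constants never need parentheses
    s = generate(child)
    if isinstance(child, BinOp) and \
            BinOp.PRECEDENCE[child.op] < BinOp.PRECEDENCE[parent_op]:
        return '(%s)' % s
    return s

# b + c * k needs no parentheses; (b + c) * k does
tree = BinOp('*', BinOp('+', Const(2), Const(3)), Const(4))
print(generate(tree))  # (2 + 3) * 4
```

A real code generator would also have to consider associativity and unary operators, but the shape is the same: information flows down the tree (the parent's precedence) while generated strings flow back up.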
Here’s a sample function before being parsed into an AST:
const Entry* HashFind(const Hash* hash, const char* key)
{
unsigned int index = hash_func(key, hash->table_size);
Node* temp = hash->heads[index];
while (temp != NULL)
{
if (!strcmp(key, temp->entry->key))
return temp->entry;
temp = temp->next;
}
return NULL;
}
And here it is when dumped back from a parsed AST by c-to-c.py:
const Entry *HashFind(const Hash *hash, const char *key)
{
int unsigned index = hash_func(key, hash->table_size);
Node *temp = hash->heads[index];
while (temp != NULL)
{
if (!strcmp(key, temp->entry->key))
return temp->entry;
temp = temp->next;
}
return NULL;
}
Indentation and whitespace aside, it looks almost exactly the same. Note the curiosity on the declaration of index. In C you can specify several type names before a variable (such as unsigned int or long long int), but c-to-c.py has no idea in what order to print them back. The order itself doesn’t really matter to a C compiler – unsigned int and int unsigned are exactly the same in its eyes. unsigned int is just a convention used by most programmers.
A final word: since this is just an example, I didn’t invest too much into the validation of c-to-c.py – it’s considered "alpha" quality at best. If you find any bugs, please open an issue and I’ll have it fixed.
Related posts:
- Implementing cdecl with pycparser cdecl is a tool for decoding C type declarations. It...
- pycparser now supports C99 Today I released pycparser version 2.00, with support for C99...
- SICP section 5.3 I liked the way the authors used vectors to simply...
Python 4 Kids
Time for Some Introspection
Did you notice some errors in the previous tutorial? One was fatal. The fact that no one commented on them indicates to me that no one is actually typing in the code – naughty naughty! Type it in. It’s important.
The errors have been corrected now, but they were:
pickle.dump(fileObject, triviaQuestions)
(the order of the arguments is wrong, the object to dump goes first, and the file object to dump it into goes next); and
there was a stray full stop at the end of one line.
If you typed in the previous tutorial you should have received the following error:
>>> pickle.dump(fileObject,triviaQuestions)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.6/pickle.py", line 1362, in dump
    Pickler(file, protocol).dump(obj)
  File "/usr/lib64/python2.6/pickle.py", line 203, in __init__
    self.write = file.write
AttributeError: 'list' object has no attribute 'write'
Or something like it – the exact error may be different depending on what version of python you are running.
If you receive an error like this you can always use the interpreter’s built in help function to assist:
>>> help(pickle.dump)
Help on function dump in module pickle:

dump(obj, file, protocol=None)
This is not entirely enlightening, but it does tell you the order of the arguments – the object first, followed by the file, followed by a third, optional, argument (protocol). We know it is optional because it is assigned a default value.
The object itself is also able to tell you about itself. This is called “introspection”. In English introspection means looking inward. People who are introspective spend time thinking about themselves. In Python, introspection is the ability of the program to examine, or give information about, itself. For example, try this:
>>> print pickle.__doc__
Create portable serialized representations of Python objects.

See module cPickle for a (much) faster implementation.
See module copy_reg for a mechanism for registering custom picklers.
See module pickletools source for extensive comments.

Classes:

    Pickler
    Unpickler

Functions:

    dump(object, file)
    dumps(object) -> string
    load(file) -> object
    loads(string) -> object

Misc variables:

    __version__
    format_version
    compatible_formats
This shows the “docstring” for the pickle module. A docstring is a string which holds documentation about the object. We have learnt from the docstring that pickle has methods for dumping objects to strings as well as files. Any object can have a docstring; for example, our triviaQuestions list had one [if you redo the previous tute to reconstruct it, since we haven't instantiated it this time]:
>>> triviaQuestions.__doc__
"list() -> new empty list\nlist(iterable) -> new list initialized from iterable's items"
In this case, the docstring is the same for all lists (try [].__doc__). However, some objects, particularly classes (which we haven’t met yet) and functions, are able to have their own docstrings which are particular to that object. A docstring can be created for an object by adding a comment in triple single quotes (''') at the start of the object’s definition (other comment forms like single quotes work, but triple single quotes are the convention so that you can include apostrophes etc in the docstring):
>>> def square(x):
...     '''Given a number x return the square of x (ie x times x)'''
...     return x*x
...
>>> square(2)
4
>>> square.__doc__
'Given a number x return the square of x (ie x times x)'
When you write code you should also write docstrings which explain what the code does. While you may think you’ll remember what it does in the future, the reality is that you won’t!
How did I know that pickle had its own docstring? Well, I read it somewhere, like you read it here. However, if you ever find yourself needing to work out what forms part of an object, Python has a function to do it – it’s called dir(). You can use it on any object. Let’s have a look at it on the square() function we just made up:
>>> dir(square)
['__call__', '__class__', '__closure__', '__code__', '__defaults__', '__delattr__', '__dict__', '__doc__', '__format__', '__get__', '__getattribute__', '__globals__', '__hash__', '__init__', '__module__', '__name__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'func_closure', 'func_code', 'func_defaults', 'func_dict', 'func_doc', 'func_globals', 'func_name']
I bet you didn’t realise that the function we just defined now had so many attributes/methods!! You can see that __doc__ is one of them. Where an attribute starts with two underscores ‘__’ it’s got a special meaning in Python. You can pronounce the two underscores in a number of different ways including: “underscore underscore”, “under under”, “double underscore”, “double under” and, my favourite, “dunder”.
To tell whether these are methods (think functions) rather than attributes (think values) you can use the callable function:
>>> callable(square.__repr__)
True
>>> callable(square.__doc__)
False
If it is callable, then you can add parentheses to it and treat it like a function (sometimes you will need to know what arguments the callable takes):
>>> square.__repr__()
'<function square at 0x7f0b977fab90>'
The __repr__ method of an object gives a printable version of the object.
When something goes wrong with your program you can use Python’s introspection capabilities to get more information about what might have gone wrong and why. Also, don’t forget to check the Python docs!
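Putting dir() and callable() together, a short sketch (the square function here is the same one we defined above; the variable name methods is just for illustration):

```python
def square(x):
    '''Given a number x return the square of x (ie x times x)'''
    return x * x

# collect the names of square's callable attributes (its methods)
methods = [name for name in dir(square) if callable(getattr(square, name))]
print(methods)  # includes '__call__', '__repr__' and many more

# attributes that are not callable are plain values
print(callable(square.__doc__))  # False -- __doc__ is just a string
```

This is a handy pattern for the homework below: dir() tells you what names an object has, and callable() tells you which of them you can call.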
Homework:
- go over previous tutes and identify 3 objects
- for each of these objects:
- re-do the relevant tute to instantiate (ie create) each of these objects;
- look at the docstring for the object (print objectName.__doc__); and
- look at the directory listing for the object (print dir(objectName)).
- Extra marks:
- find some callable methods in one listing and call them.
Invent with Python
New Game Source Code: Squirrel Eat Squirrel
Made a new game with Pygame. It’s called “Squirrel Eat Squirrel”, where you move your squirrel around the screen eating the smaller squirrels and avoiding the larger ones. The more squirrels you eat, the larger you grow. This is a Python 3 game, but I think it’s compatible with Python 2. You need Pygame installed as well.
Use the arrow keys to move around. You can be hit three times before you die.
Try modifying the constant variables at the top of the file to change around the game. (Squirrel speeds, number of squirrels, amount of health, etc.) This isn’t part of my Code Comments tutorials, since I haven’t had time to go through and add detailed comments to the code (but it’s still commented.)
Python 4 Kids
A Big Jar of Pickles
In the last tutorial we learned how to pickle our objects. Pickling is a way of storing the object (on the computer’s file system) so that it can be used later. This means that if we want to reuse an object we can simply save it and load it when we need it, rather than re-creating it each time we want to use it. This is very useful when our object is a list of questions for our trivia game. We really only want to type the questions in once and then reload them later.
Now we need to settle on a way to structure our data. We saw in our earlier tutorial that each question was a list, and that the list itself had a certain structure. We also need to think about how a number of questions will be stored. We will use a list to do that as well! In this case we will have a list of questions. Each of the elements in the list will itself be a list. Let’s build one. First we make an empty list to store all the questions:
triviaQuestions=[]
It is empty:
len(triviaQuestions)
Next, let’s make a sample question to add to that list. Feel free to use your own questions/answers if you want to use your own topic:
sampleQuestion = []
Now, we populate the sample question:
sampleQuestion.append("Who expects the Spanish Inquisition?")
# first entry must be the question
sampleQuestion.append("Nobody")
# second entry must be the correct answer
sampleQuestion.append("Eric the Hallibut")
sampleQuestion.append("An unladen swallow")
sampleQuestion.append("Brian")
sampleQuestion.append("Me!")
# any number of incorrect answers can follow
# but they must all be incorrect
There are 6 elements in the sampleQuestion list:
len(sampleQuestion)
Now, we add the sample question (as the first entry) to the list of trivia questions:
triviaQuestions.append(sampleQuestion)
It now has one question in it:
len(triviaQuestions)
To add more questions we “rinse and repeat”:
sampleQuestion = []
# this clears the earlier entries
# if we append without doing this
# we'll have multiple questions in the wrong list
sampleQuestion.append("What is the air-speed velocity of an unladen swallow?")
sampleQuestion.append("What do you mean? African or European swallow?")
sampleQuestion.append("10 m/s")
sampleQuestion.append("14.4 m/s")
sampleQuestion.append("23.6 m/s")
triviaQuestions.append(sampleQuestion)
Now, the sampleQuestion has five entries and there are two questions in total:
len(sampleQuestion)
len(triviaQuestions)
Now we need to save the question list so we can use it again later. We will save it to a file called “p4kTriviaQuestions.txt”. Ideally we would test to see whether this file already exists before first creating it (so that we don’t inadvertently wipe some valuable file). Today however, we’re just crossing our fingers and hoping that you don’t already have a file of this name in your directory:
import pickle
fileName = "p4kTriviaQuestions.txt"
fileObject = open(fileName,'w')
pickle.dump(triviaQuestions,fileObject) # oops! earlier draft had these in the wrong order!
fileObject.close()
So far we have spent a lot of time on how to store the data used by the game. However, in order to hang the various parts of the trivia game together we need to learn about storing a different part of the game – the program itself. We will be looking at that in the coming tutorials.
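As a quick check that the save worked, the pickled file can be read straight back with pickle.load. A minimal sketch (the question list here is abbreviated for illustration; note the binary 'wb'/'rb' modes, which Python 3 requires for pickle files):

```python
import pickle

# an abbreviated question list, just for this sketch
triviaQuestions = [["Who expects the Spanish Inquisition?", "Nobody", "Brian"]]

# save the list (binary mode: required by Python 3)
with open("p4kTriviaQuestions.txt", "wb") as fileObject:
    pickle.dump(triviaQuestions, fileObject)

# load it back into a brand new variable
with open("p4kTriviaQuestions.txt", "rb") as fileObject:
    loadedQuestions = pickle.load(fileObject)

print(loadedQuestions == triviaQuestions)  # True
```

If the round trip prints True, your questions survived the trip to disk and back.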
Richard Jones
PyWeek 12 (April 2011) registration is open!
The 12th Python Game Programming Challenge (PyWeek) is almost upon us. It'll run from the 3rd to the 10th of April. Registration for teams and individuals is now open on the website.
The PyWeek challenge:
- Invites entrants to write a game in one week from scratch either as an individual or in a team,
- Is intended to be challenging and fun,
- Will hopefully increase the public body of game tools, code and expertise,
- Will let a lot of people actually finish a game, and
- May inspire new projects (with ready made teams!)
If you've never written a game before and would like to try things out then perhaps you could try either:
- The tutorial I presented at LCA 2010, Introduction to Game Programming, or
- The book Invent Your Own Computer Games With Python
March 06, 2011
Invent with Python
New Extra Game: Connect Four clone
I have a text version of a Connect Four clone done. The AI for it looks ahead two moves, which makes it nearly impossible to beat unless you concentrate. I was planning to use this game for a chapter on recursion in my next book, but decided to publish the code for the text version now.
Download fourinarow.py (This is for Python 3, not Python 2)
The code has few comments, but looking at its source code might be a good exercise for someone learning to program. It’s available on the book’s website in the Extra section.
Nick Coghlan
What is a Python script?
This is an adaptation of a lightning talk I gave at PyconAU 2010, after realising a lot of the people there had no idea about the way CPython's concept of what could be executed had expanded over the years since version 2.4 was released. As of Python 2.7, there are actually 4 things that the reference interpreter will accept as a main module.
Ordinary scripts: the classic main module identified by filesystem path, available for as long as Python has been around. Can be executed without naming the interpreter through the use of file associations (Windows) or shebang lines (pretty much everywhere else).
Module name: By using the -m switch, a user can tell the interpreter to locate the main module based on its position in the module hierarchy rather than by its location on the filesystem. This has been supported for top level modules since Python 2.4, and for all modules since Python 2.5 (via PEP 338). Correctly handles explicit relative imports since Python 2.6 (via PEP 366 and the __package__ attribute). The classic example of this usage is the practice of invoking "python -m timeit 'snippet'" when discussing the relative performance of various Python expressions and statements.
Valid sys.path entry: If a valid sys.path entry (e.g. the name of a directory or a zipfile) is passed as the script argument, CPython will automatically insert that location at the beginning of sys.path, then use the module name execution mechanism to look for a __main__ module with the updated sys.path. Supported since Python 2.6, this system allows quick and easy bundling of a script with its dependencies for internal distribution within a company or organisation (external distribution should still use proper packaging and installer development practices). When using zipfiles, you can even add a shebang line to the zip header or use a file association for a custom extension like .pyz and the interpreter will still process the file correctly.
Package name: If a package name is passed as the value for the -m switch, the Python interpreter will reinterpret the command as referring to a __main__ submodule within that package. This version of the feature was added in Python 2.7, after some users objected to the removal in Python 2.6 of the original (broken) code that incorrectly allowed a package's __init__.py to be executed as the main module. Starting in Python 3.2, CPython's own test suite supports this feature, allowing it to be executed as "python -m test".
The above functionality is exposed via the runpy module, as runpy.run_module() and runpy.run_path().
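The sys.path entry mechanism can even be exercised from within Python via runpy. A minimal sketch (the directory layout and script contents are invented for illustration):

```python
import os
import runpy
import tempfile

# build a directory containing a __main__.py -- the same layout that
# "python some_dir" or a zipped application would use
with tempfile.TemporaryDirectory() as appdir:
    with open(os.path.join(appdir, "__main__.py"), "w") as f:
        f.write("result = 6 * 7\n")

    # run_path accepts either a script path or a valid sys.path entry
    # (directory/zipfile); it returns the resulting module globals
    namespace = runpy.run_path(appdir)
    print(namespace["result"])  # 42
```

The same call with a path to an ordinary .py file covers the classic script case, and runpy.run_module() covers the -m variants.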
If anyone ever sees me (metaphorically) jumping up and down about making sure things get mentioned in the What's New document for a new Python version, this is why. Python 2.6 was released in October 2008, but we didn't get the note about the zipfile and directory execution trick into the What's New until February 2010. It is described in the documentation, but really, who reads the command line documentation, or is likely to be casually browsing the runpy docs? This post turning up on Planet Python will probably do more to get the word out about the functionality than anything we've done before now :)
March 05, 2011
Roberto Alsina
De Vicenzo: A much cooler mini web browser.
It seems it was only a few days ago that I started this project. Oh, wait, yes, it was just a few days ago!
If you don't want to read that again, the idea is to see just how much code is needed to turn Qt's WebKit engine into a fully-fledged browser.
To do that, I set myself a completely arbitrary limit: 128 lines of code.
So, as of now, I declare it feature-complete.
The new features are:
- Tabbed browsing (you can add/remove tabs)
- Bookmarks (you can add/remove them, and choose them from a drop-down menu)
This is what already worked:
- Zoom in (Ctrl++)
- Zoom out (Ctrl+-)
- Reset Zoom (Ctrl+=)
- Find (Ctrl+F)
- Hide find (Esc)
- Buttons for back/forward and reload
- URL entry that matches the page + autocomplete from history + smart entry (adds https://, that kind of thing)
- Plugins support (including flash)
- The window title shows the page title (without browser advertising ;-)
- Progress bar for page loading
- Statusbar that shows hovered links URL
- Takes a URL on the command line, or opens https://python.org
- Multiplatform (works in any place QtWebKit works)
So... how much code was needed for this? 87 LINES OF CODE
Or if you want the PEP8-compliant version, 115 LINES OF CODE.
Before anyone says it: yes, I know the rendering engine and the toolkit are huge. What I wrote is just the chrome around them, just like Arora, Rekonq, Galeon, Epiphany and a bunch of others do.
It's simple, minimalistic chrome, but it works pretty well, IMVHO.
Here it is in (buggy) action:
It's more or less feature-complete for what I expected to be achievable, but it still needs some fixes.
You can see the code at its own home page: https://devicenzo.googlecode.com
PyCharm
PyCharm 1.2 Release Candidate; Execute selection in console
Django 1.3 is almost ready for release, and so is PyCharm 1.2. The Release Candidate build, in addition to a bunch of improvements for Django support and general bugfixes, includes a new feature: an action in the context menu to execute the selected code fragment in a Python console. It uses a running console if one exists, or starts a new one otherwise.
As usual, the PyCharm 1.2 Release Candidate download and the Release Notes are available on the PyCharm EAP page.
Nick Coghlan
Justifying Python language changes
A few years back, I chipped in on python-dev with a review of syntax change proposals that had made it into the language over the years. With Python 3.3 development starting and the language moratorium being lifted, I thought it would be a good time to tidy that up and republish it as a blog post.
Generally speaking, syntactic sugar (or new builtins) need to take a construct in idiomatic Python that is fairly obvious to an experienced Python user and make it obvious to even new users, or else take an idiom that is easy to get wrong when writing (or miss when reading) and make it trivial to use correctly.
Providing significant performance improvements (usually in the form of reduced memory usage or increased speed) also counts heavily in favour of new constructs.
I strongly suggest browsing through past PEPs (both accepted and rejected ones) before proposing syntax changes, but here are some examples of syntactic sugar proposals that were accepted.
List/set/dict comprehensions
(and the reduction builtins any(), all(), min(), max(), sum())
target = [op(x) for x in source]

instead of:

target = []
for x in source:
    target.append(op(x))

The transformation (`op(x)`) is far more prominent in the comprehension version, as is the fact that all the loop does is produce a new list. I include the various reduction builtins here, since they serve exactly the same purpose of taking an idiomatic looping construct and turning it into a single expression.
Generator expressions
total = sum(x*x for x in source)

instead of:

def _g(source):
    for x in source:
        yield x*x

total = sum(_g(source))

or:

total = sum([x*x for x in source])

Here, the GE version has obvious readability gains over the generator function version (as with comprehensions, it brings the operation being applied to each element front and centre instead of burying it in the middle of the code, as well as allowing reduction operations like sum() to retain their prominence), but doesn't actually improve readability significantly over the second LC-based version. The gain over the latter, of course, is that the GE based version needs a lot less memory than the LC version, and, as it consumes the source data incrementally, can work on source iterators of arbitrary (even infinite) length, and can also cope with source iterators with large time gaps between items (e.g. reading from a socket) as each item will be returned as it becomes available (obviously, the latter two features aren't useful when used in conjunction with reduction operations like sum, but they can be helpful in other contexts).
With statements
with lock:instead of:
# perform synchronised operations
lock.acquire()This change was a gain for both readability and writability - there were plenty of ways to get this kind of code wrong (e.g. leave out the try-finally altogether, acquire the resource inside the try block instead of before it, call the wrong method or spell the variable name wrong when attempting to release the resource in the finally block), and it wasn't easy to audit because the resource acquisition and release could be separated by an arbitrary number of lines of code. By combining all of that into a single line of code at the beginning of the block, the with statement eliminated a lot of those issues, making the code much easier to write correctly in the first place, and also easier to audit for correctness later (just make sure the code is using the correct context manager for the task at hand).
try:
# perform synchronised operations
finally:
lock.release()
Function decorators
@classmethodinstead of:
def f(cls):
# Method body
def f(cls):Easier to write (function name only written once instead of three times), and easier to read (decorator names up top with the function signature instead of buried after the function body). Some folks still dislike the use of the @ symbol, but compared to the drawbacks of the old approach, the dedicated function decorator syntax is a huge improvement.
# Method body
f = classmethod(f)
Conditional expressions
x = A if C else Binstead of:
x = C and A or BThe addition of conditional expressions arguably wasn't a particularly big win for readability, but it was a big win for correctness. The and/or based workaround for the lack of a true conditional expression was not only hard to read if you weren't already familiar with the construct, but using it was also a potential source of bugs if A could ever be False while C was True (in such cases, B would be returned from the expression instead of A).
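The failure mode is easy to demonstrate. A quick sketch with invented values where A happens to be falsy:

```python
A, B, C = 0, "fallback", True

# the old and/or idiom: because A (0) is falsy, the expression
# falls through to B even though the condition C is true
old_result = C and A or B
print(old_result)  # 'fallback' -- wrong, we wanted 0

# the conditional expression gets it right
new_result = A if C else B
print(new_result)  # 0
```

The and/or version only worked reliably when A was guaranteed to be truthy, a constraint that was easy to forget and invisible at the call site.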
Except clause
except Exception as ex:instead of:
except Exception, ex:Another example of changing the syntax to reduce the potential for non-obvious bugs (in this case, except clauses like `except TypeError, AttributeError:`, that would actually never catch AttributeError, and would locally do AttributeError=TypeError if a TypeError was caught).