Planet Python
Last update: September 07, 2008 08:41 AM
September 07, 2008
Frank Wierzbicki
djangocon day 1
The first day of the inaugural Djangocon took place at the Googleplex today. Since I wasn't madly writing a presentation at this one (as I have been for every conference I've been at in recent memory), I actually took some notes.
Some highlights:
Leslie Hawthorn (of Google Summer of Code fame, if you don't recognize the name) is here at Djangocon and was apparently a big part of making it happen. As always she was cranked up to 11. It's hard to believe just how much stuff she's involved in. Perhaps there is more than one LH?
Guido van Rossum talked about Google App Engine. I am *really* going to need to take it for a spin in my copious spare time. He hinted that more languages beyond Python are being audited for use in App Engine, but there was no promise of a time frame (apparently even the audit of Python -- probably one of the best known languages at Google -- took forever). He didn't say which ones. I will place my bets at 60% on Ruby and 40% on Java -- though I wonder what MVC framework would make sense from Java...
Adrian Holovaty reminisced about some of the bad times with early Django code, but also how much of the original concepts remain almost unchanged. He then went on to describe the current state of Django. My favorite bit was this quote from the tracker around August 2005:
I can't think of any other backwards-incompatible changes that we're planning before 1.0 (knock on wood). If this isn't the last one, though, it's at least the last major one.
Jacob Kaplan-Moss continued on Django history and the current cool features of modern Django, also with some pretty funny anecdotes. He also plugged Django on Jython :). That, by the way, was very much a two-way effort; the Django folks were incredibly supportive of our efforts to get Django working with Jython.
Justin Bronn gave a fascinating talk on GeoDjango that started with the fundamentals of GIS and then went into how to use GeoDjango to map-enable your code. It relies on ctypes, which may not be too hard to get running on Jython. GeoDjango demos very well, so that's one more reason for us to take a look at getting ctypes supported.
Cal Henderson was hilarious in his "Why I hate Django" talk. It was a combination of an interesting survey of what makes a site like Flickr scale so well and a sort of loving roast of Django. There's no point in trying to convey it -- I think they recorded all of these talks, and if you are a fan of Django I'm certain you would enjoy his.
I also released Jython 2.5 Alpha 2 in the middle of all of this, met Leo Soto (our Django Google Summer of Code student), and talked over some coroutine issues with Jim Baker. All in all a very good day.
September 06, 2008
Phillip J. Eby
Python Gets Out...
Python seems to keep turning up in the most unusual places. Today I went to the library and borrowed a couple of books on graphic design to assist in making some layout decisions for the book I'm working on. One was a book I'd read before, Editing By Design, which I'd used to help with the design of my earlier book, "You, Version 2.0". The other was a book called (appropriately enough) "The Layout Book". I was skimming through it, when I came across a page with this near the bottom (I've elided a few items from the middle):
"Simple is better than complicated.
Quiet is better than loud.
Unobtrusive is better than exciting.
Small is better than large.
...
The obvious is better than that which must be sought.
Few elements are better than many.
A system is better than single elements."
The block of text was a quote, attributed to one Dieter Rams. "Wow," I thought, "I wonder if Tim Peters's Zen of Python was a play off of this..."
Then I turn the page.
At the very top of a collection of "methodologies", I see:
"Python philosophy
Derived from computer programming, the main points of the Python approach were presented by developer Tim Peters in The Zen of Python. Key points include: beautiful is better than ugly, explicit is better than implicit..."
Small world, eh?
--PJ
P.S. I'm still amused by the mentions of Python in Charles Stross's science fiction novels, especially the one where the future hero is described as doing his game programming work in Python 3000, almost as if it were some highly-futuristic language. ;-)
P.P.S. In case you hadn't guessed, the reason I'm not doing more programming (or blogging about programming) right now is because I'm working on the book... in which, incidentally, I'm attempting to take a truly algorithmic approach -- not to mention a highly test-driven one -- to such diverse matters as motivation, belief, creativity, time management, and even optimism.
Frank Wierzbicki
Jython 2.5 Alpha 2 Released!
On behalf of the Jython development team, I'm pleased to announce that Jython 2.5a2+ is available for download. See the installation instructions.
Django runs pretty well on this release. I am attending Djangocon where Jim Baker and Leo Soto will be presenting on Django on Jython, and I wanted them to be able to tell people to grab a release instead of telling them to grab Jython from svn.
There are many bug fixes, but also many bugs that have not yet been fixed. This is an alpha release so be careful!
Will Kahn-Greene
Comments are working again....
Back in June I must have:
- disabled or uninstalled suexec
- upgraded apache without installing a new suexec
- done a Debian upgrade that did something
Anyhow, comments are working again.
Long strings in Python
In Miro, we've got long strings that are displayed in the user interface. I think the code that defines these strings is messy and hard to parse. For example:
def some_func():
    description = _("""\
This is a really long description that has multiple sentences and a few \
things that need to %(getfilledin)s and it goes on and on and on and on \
and I'm not really sure what's the best way to format it so that it's happy \
in editors and easier to parse.""") % {"getfilledin": blahblah}
PEP-8 doesn't address this, which is fine. I was curious to see what other projects do.
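For comparison, one layout I've seen other projects use is implicit concatenation of adjacent string literals inside parentheses, which avoids the trailing backslashes. A minimal sketch; the _ and blahblah names are just stand-ins mirroring the snippet above:

# stand-ins for the gettext function and the interpolated value
_ = lambda s: s
blahblah = "be filled in"

def some_func():
    # adjacent literals are joined by the parser, so no backslashes needed
    description = _(
        "This is a really long description that has multiple sentences "
        "and a few things that need to %(getfilledin)s and it goes on "
        "and on and on and on.") % {"getfilledin": blahblah}
    return description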
Stamp values
The United States Post Office increases the cost of postage periodically. They do it faster than I use up the stamps that I've bought. In the process, I ended up with first class stamps that have no value listed on them, so I had no idea how much they were worth.
Buried on the USPS website is this quick service guide that lists the values of all stamps that have no value printed on them. Hooray for search engines!
Greg Wilson
What’s Your Favorite Online Survey Engine?
We’re running a survey this fall of how scientists use computers, and while SurveyMonkey does most of what we want, it doesn’t do everything. We would welcome pointers to other online survey engines that:
- Are very reliable.
- Provide a wide variety of tools (including “Rank the following in order of importance” with an option to mark some “not applicable”).
- Don’t fill 80% of the screen with banner ads.
Anyone have experiences to share?
Beginning Python for Bioinformatics
Python, overrepresented motifs, the Grand Finale
In this final part, let’s do some very simple refactoring and modify the output section to make the result a little bit better. There aren’t many options for the function that calculates the binomial expansion, but Andrew posted some opinions on how to slightly change the quorum function.
def get_quorums(seqs, mlen):
    """Count the occurrences of every motif of length mlen,
    using a plain integer counter instead of a set of sequence ids.
    """
    quorum = defaultdict(int)
    for seq in seqs:
        for n in range(len(seq) - mlen):
            quorum[seq[n:n + mlen]] += 1
    return quorum
His modifications were small but improved the code a bit, since they remove one variable/object from the function. At the same time we need to change our output section a bit, because the defaultdict is no longer initialized with a set but with an integer.
for i in foreground:
    term1 = choose(background[i], foreground[i])
    term2 = choose((N - background[i]), len(input_seqs) - 1)
    term3 = choose(N, len(input_seqs))
    p = (float(term1) * float(term2)) / term3
    if 0 < p <= 0.0001:
        print i, foreground[i], background[i], p
Notice that in the term1 line we don’t take the length of a set anymore; we just use the integer stored in foreground and background. Again, a small change that makes the code a little clearer. But we still need to modify this section so that the output itself is clearer, maybe ordered by motif sequence.
Because we read the sequences in the order they come, our results are not ordered. It would be great to have a final list starting with AAAAAAAAAA and ending with TTTTTTTTTT. There is an easy way to do that, and it is very inexpensive in both code and final performance: we append each of the motifs (with their extra information) to a list and use the sort method for lists. So our output section of the code becomes
res_motifs = []
for i in foreground:
    term1 = choose(background[i], foreground[i])
    term2 = choose((N - background[i]), len(input_seqs) - 1)
    term3 = choose(N, len(input_seqs))
    p = (float(term1) * float(term2)) / term3
    if 0 < p <= 0.0001:
        res_motifs.append(i + '\t' + str(foreground[i]) + '\t' + str(background[i]) + '\t' + str(p))
res_motifs.sort()
for i in res_motifs:
    print i
Putting everything together, our final motif determination script is (batteries included):
#!/usr/bin/env python

import fasta
import sys
from collections import defaultdict

def choose(n, k):
    # binomial coefficient: the number of ways to choose k items out of n
    if 0 <= k <= n:
        ntok = 1
        ktok = 1
        for t in xrange(1, min(k, n - k) + 1):
            ntok *= n
            ktok *= t
            n -= 1
        return ntok // ktok
    else:
        return 0

def get_quorums(seqs, mlen):
    """Count the occurrences of every motif of length mlen,
    using a plain integer counter instead of a set of sequence ids.
    """
    quorum = defaultdict(int)
    for seq in seqs:
        for n in range(len(seq) - mlen):
            quorum[seq[n:n + mlen]] += 1
    return quorum

input_seqs = fasta.read_seqs(open(sys.argv[1]).readlines())
input_seqs2 = fasta.read_seqs(open(sys.argv[2]).readlines())

foreground = get_quorums(input_seqs, 10)
background = get_quorums(input_seqs2, 10)
N = len(input_seqs) + len(input_seqs2)

res_motifs = []
for i in foreground:
    term1 = choose(background[i], foreground[i])
    term2 = choose((N - background[i]), len(input_seqs) - 1)
    term3 = choose(N, len(input_seqs))
    p = (float(term1) * float(term2)) / term3
    if 0 < p <= 0.0001:
        res_motifs.append(i + '\t' + str(foreground[i]) + '\t' + str(background[i]) + '\t' + str(p))
res_motifs.sort()
for i in res_motifs:
    print i
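To run it, pass the foreground and background FASTA files on the command line, matching sys.argv[1] and sys.argv[2] above. A usage sketch (the script and file names here are just placeholders):

python motifs.py foreground.fasta background.fasta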
Next we will look at some basic Python methods, and maybe start a new series and phase.
September 05, 2008
Mike C. Fletcher
Want that tool, but cores are useless without debug symbols
Tried to get the gdb-with-python recipe (that we discussed at PyGTA) working to investigate the crashes from the rewrite. I gather it requires that you have debugging symbols built for Python and all of its modules (well, to be able to trace into them, anyway). I'm on an Ubuntu box here, and not really sure how to create the debugging symbols without rebuilding from source. I'm guessing that there's something like Gentoo's "splitdebug" available in some package if I knew where to look.
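Perhaps the Debian-style -dbg packages are what I'm after; if Hardy ships them, something like this might avoid a source rebuild (an untested guess on my part, and it wouldn't cover third-party extension modules):

sudo apt-get install python2.5-dbg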
I've got my machine at home (Gentoo) rebuilding Python, pygame, wxPython, numpy, etceteras with debugging symbols (turns out Gentoo auto-strips (duh!) even though I have -g specified in the make options, so no debug symbols there at the moment). I want to eventually get the pyglet-generator-based branch fixed up, and that will require the debugging symbols when I get around to it.
Not much else done today other than to get all of the Demos/tests in PyOpenGL, OpenGLContext and PyOpenGL-Demo working under Ubuntu Hardy. Wanted to do Win32 as well, but ran out of time. Especially need to check/fix extensions as we have a bug reported for them.
Sigh, too far, too fast
Have had to shelve the changes for PyOpenGL to use the pyglet generator for the core code. Wound up getting extremely hard-to-debug errors which appeared to be due to corruption of objects' GC structures (possibly due to GIL-holding issues) when I switched to a 32-bit Linux platform (instead of 64-bit).
I don't want to hold up the 3.0 release trying to debug them (given that I'm going to be very busy for a month or more I likely won't have much PyOpenGL time). Seems that we'll have to wait for post 3.0 to use the Pyglet generator.
If someone wants a little side-project to figure out what I did wrong, they can check out the pyglet_generator_tests branch of the OpenGL-ctypes code-base.
Corey Goldberg
Performance Testing - Load Balancer SSL Stress Testing
I am in the process of stress testing a new load balancer (F5 Big-IP 9.x). One concern with the device is its ability to handle concurrent SSL connections and terminations, and what the impact is on performance and resources. The nature of SSL (encryption) makes it very CPU intensive. This is basically a torture test of HTTP GETs over SSL. What I need is a program that can keep HTTP requests active at line speed so there are always concurrent requests being served.
The goal I am looking to reach is 50k concurrent active SSL connections going through the load balancer to a web farm.
First off, I need a load generator program. If I were to use a commercial load testing tool (LoadRunner, SilkPerformer, etc), I would most likely be unable to do a test like this due to Virtual User licensing costs. So the best way for me to achieve this is with a custom homebrewed tool. So of course I reached for Python.
The load generating machines I was given have dual 2GHz CPUs, 2GB memory, and run Windows XP. The real challenge is seeing how many concurrent requests I can send from a Windows box!
First I started with a Python script that launches threads to do the HTTP (SSL) requests.
The script looks like this:
#!/usr/bin/env python
# ssl_load.py - Corey Goldberg - 2008

import httplib
from threading import Thread

threads = 250
host = '192.168.1.14'
file = '/foo.html'

def main():
    for i in range(threads):
        agent = Agent()
        agent.start()

class Agent(Thread):
    def __init__(self):
        Thread.__init__(self)

    def run(self):
        # keep a request in flight at all times
        while True:
            conn = httplib.HTTPSConnection(host)
            conn.request('GET', file)
            resp = conn.getresponse()

if __name__ == '__main__':
    main()
This works great for small workloads, but Windows stops me from creating threads at around 800 per process (sometimes it was flaky and crashed after 300), so that won't work. The next step was to break the load into several processes and launch threads from each process. Rather than try to wrap this into a single script, I just use a starter script to call many instances of my load test script. I used a starter script like this:
#!/usr/bin/env python
import subprocess

processes = 60

for i in range(processes):
    subprocess.Popen('python ssl_load.py')
250 threads per process seemed about right and didn't crash my system. Using this, I was able to launch 60 processes, which gave me a total of 15k threads running. I didn't reach the 50k goal, but on that hardware I doubt anyone could. With 15k threads per machine, I was able to use 4 load generating machines to achieve 60k concurrent SSL connections. Not bad!
Ned Batchelder's blog
OpenID is too hard
OpenID is one of those web technologies I would love to love: it addresses a need, seems pretty well thought-out, and all the cool kids are doing it. But the fact is, it's still a bit too hard for what it's trying to be. When I first heard about OpenID, I read about it, and didn't quite get it. People kept talking about it, so I kept going back to read about it, and it still mystified me.
Big players started adopting it (AOL, Yahoo), so it seemed like it was here to stay, but I still didn't have the incentive to get over the learning curve. Earlier this week I visited yet another site that encouraged me to get an OpenID, and I decided I would finally cross OpenID off my list of technologies I should at least understand and probably use.
The simplest way to use OpenID is to pick a provider like Yahoo, go to their OpenID page, and enable your Yahoo account to be an OpenID. This in itself was a little complicated, because when I was done, I got to a page that showed me my "OpenID identifiers", which had one item in it:
https://me.yahoo.com/a/.DuSz_IEq5Vw5NZLAHUFHWEKLSfQnRFuebro-
What!? What is that, what do I do with it? Am I supposed to paste that into OpenID fields on other sites? Are you kidding me? Also, in the text on that page is a stern warning:
This step is completely optional. After you choose an identifier, you cannot edit or delete it.
(Emphasis theirs). So now I have a mystifying string of junk, with a big warning all over it that I can't go back. "This step" claims it's optional, but I seem to have already done it! Now I'm afraid, and I'm a technical person — you expect my wife to do this?
Luckily I can choose to enable other identifiers, so I also enable my flickr account as an OpenID.
Since I am a technical person, I've learned that OpenID supports delegation. That's a way to have your website act as an OpenID simply by adding some HTML to your page. The HTML points to another OpenID behind the scenes. That way, I can use nedbatchelder.com as my OpenID, and later be able to change who is actually hosting my OpenID.
Simon Willison shows the simple way to delegate your OpenID on your home page. You need the id you just got from your provider, and you need a URL for the provider's server. Oh, bad news: Yahoo won't say what their server's URL is. I can't delegate to Yahoo. Why? Don't know. Time to get another provider.
So I go to a more savvy provider, get an ID and a delegate server URL, edit my page, and I can't log in to my desired site. I must have messed something up. A good debugging tool for this is to log in to jyte.com. Since it was built by JanRain, the company behind a lot of OpenID, they helpfully provide very geeky error messages if the OpenID login fails for some reason. It turns out I had omitted my user id from one place in the HTML. Once I fixed that, all was well.
But what have I really gained? Ted Dziuba exuberantly rants about OpenID, since it is why he hates the Internet, and his points are accurate: OpenID is still really difficult, and doesn't gain you that much.
Stefan Brands rounds up lots of issues with OpenID, and I think they need to be taken seriously. OpenID may be one of those Internet technologies that will be fabulous among the savvy and well-intentioned, but falters when spread to the wider population on the web.
Swaroop C H, The Dreamer » Python
Book updated for Python 3.0
After a gap of 3.5 years, I’ve finally updated the ‘A Byte of Python’ book.
The interesting news is that it is updated for the upcoming Python 3.0 language making it probably the first book to be released for Python 3.0.
The book is now a wiki too at www.swaroopch.com/notes/Python which means you can contribute too!
The book and wiki are now under the Creative Commons Attribution-Share Alike 3.0 Unported license. The Non-Commercial clause present in the previous edition of the book has been removed. It was becoming a hurdle for translators as well as people who wanted to use the book for genuinely good activities, so I decided to drop the clause.
Since it is a wiki, volunteers can directly create their translations on the wiki. This eliminates the need to learn DocBook XML and its tools which had become a hindrance for many translators, and I’m glad to see this already bearing fruit with Eirik Vågeskar starting off a Norwegian translation at www.swaroopch.com/notes/Python_nb-no:Forord.
I will soon be making a printed version of the book available as I have had many requests for this.
So back to the main question: Why an update after nearly 4 years? Two reasons.
First, because of foss.in. I dedicate this new release to the foss.in community for their spirit and enthusiasm over the years which have rubbed off on me and kept me working on the update of the book.
Second, over the past few years, the readers’ reactions have been simply splendid:
Neil (bigdealneil-at-yahoo-dot-com) said:
“(I) got an if else to work and I can follow your tutorial, which I have never been able to do no matter who wrote the thing! you’re a genius Swaroop!”
Gao shuai (ejwjvh-at-126-dot-com) took the effort to write an email to me in English:
dear swaroop: I am a chinese student.My name is gao shuai,”gao”is my family name. Although your book is easy to understand,but my english is bad,so what I read is the chinese edition. I have made some programs now.It is interesting.I like it very much.
I emailed back and he replied:
Mr Swaroop: I am exciting to read your back. _(Editor’s note: I think he means ‘reply’)_ Tt is the first time that I talk to foreigner though the internet. I saw that you have your own mail ab.I think You’re a great man. Thanks for your back!(*^_^*) regards, gaoshuai
The interesting part is that this student somewhere in China was benefiting from this book, and he “talked to a foreigner through the internet for the first time”, and that person was me. It was truly humbling.
People are even putting up ads for it, and I had no clue about it until I chanced upon one myself.
If that wasn’t enough, I found out that there are 8-9 university courses officially using the book, including Harvard and other institutions. And apparently even NASA is using the book in their Jet Propulsion Laboratory.
Users have suggested that it should replace the official tutorial, but I really wouldn’t go as far as that.
Recently, I had sent a sneak peek to the book’s group of readers and within a day, I had the first $10 donation, from Horst JENS. I remembered seeing that name somewhere, so I searched my emails and found this:
On Mar 4, 2007:
“Hello Swaroop, i teach a class of (3) Children how to program in Python. Just want to thank you because without your ‘a byte of python’ (that i read one year ago) i would maybe never have begun to code in python and consequently would never leaved my old job to become a Python teacher.”
A person in Vienna, Austria changed his career from a sys-admin job which he didn’t like to teaching children about programming, a job he loves. Wow! Again, this is so humbling. I could never have imagined that a small book could make such a difference.
The point is that I’m grateful for all these people writing to me and sharing their delight and stories. The book is still alive and kicking thanks to all these people.
Happy programming!
PyPy Development
Düsseldorf PyPy sprint 5-13th October, 2008
The PyPy team is happy to announce the next sprint, which will take place in the Computer Science Department of the University of Düsseldorf, Germany. Sprinting will start on the 6th of October and go on till the 12th. Please arrive on the day before if you want to come.
Topics of the sprint will include working towards the 1.1 release and integrating PyPy better with small devices. Other topics are also welcome!
We will try to find a hotel with group rates, so if you are interested, please sign up soon! See the announcement for more details.
Ned Batchelder's blog
Caches aplenty
My laptop has a 100Gb drive, and recently it was 98% or so full! As part of the job of cleaning it up, I used SpaceMonger to see where the space was going. I noticed a few largish directories whose names indicated they were caches of some sort, and wondered how much disk was being lost to copies of files that I didn't really need to keep around.
I cobbled together this Python script to recursively list the size of folders and files, but only if they exceed specified minimums:
""" List file sizes recursively, but only if they exceed
certain minimums.
"""
import stat, os
# Minimum size for a file or directory to be listed.
min_file = 10000000
min_dir = 1000000
format = "%15d %s"
dir_format = "%15d / %s"
err_format = " !!! ! %s"
def do_dir(d):
""" Process a single directory, return its total size,
and print intermediate results along the way.
"""
try:
files = os.listdir(d)
except KeyboardInterrupt:
raise
except Exception, e:
print err_format % str(e)
return 0
files.sort()
total = 0
for f in files:
f = os.path.join(d, f)
st = os.stat(f)
size = st[stat.ST_SIZE]
is_dir = stat.S_ISDIR(st[stat.ST_MODE])
if is_dir:
size = do_dir(f)
else:
if size >= min_file:
print format % (size, f)
total += size
if total >= min_dir:
print dir_format % (total, d)
return total
if __name__ == '__main__':
do_dir(".")
Running this on my disk, and grep'ing for "cache", I came up with this list of cache directories:
77428736 / .\Documents and Settings\All Users\Application Data\Apple\Installer Cache
193088296 / .\Documents and Settings\All Users\Application Data\Apple Computer\Installer Cache
127431856 / .\Documents and Settings\All Users\Application Data\Symantec\Cached Installs
1283586 / .\Documents and Settings\All Users\DRM\Cache
8904444 / .\Documents and Settings\batcheln\Application Data\Adobe\CameraRaw\Cache
3109555 / .\Documents and Settings\batcheln\Application Data\Dropbox\cache
9141658 / .\Documents and Settings\batcheln\Application Data\Microsoft\CryptnetUrlCache
6639905 / .\Documents and Settings\batcheln\Application Data\Sun\Java\Deployment\cache
244047364 / .\Documents and Settings\batcheln\Local Settings\Application Data\Adobe\CameraRaw\Cache
35706839 / .\Documents and Settings\batcheln\Local Settings\Application Data\Mozilla\Firefox\Profiles\0ou4abpz.default\Cache
1559441 / .\Documents and Settings\batcheln\Local Settings\Application Data\johnsadventures.com\Background Switcher\FolderQuarterScreenCache
381984768 .\Documents and Settings\batcheln\My Documents\My Pictures\Lightroom\Lightroom Catalog Previews.lrdata\thumbnail-cache.db
44671279 / .\Program Files\Adobe\Adobe Help Center\AdobeHelpData\Cache
1093120 / .\Program Files\Common Files\Microsoft Shared\SFPCA Cache
1139888470 / .\Program Files\Cyan Worlds\Myst Uru Complete Chronicles\sfx\streamingCache
73237698 / .\Program Files\Hewlett-Packard\PC COE 3\OV CMS\Lib\Cache
46559334 / .\WINDOWS\assembly\GAC
20606686 / .\WINDOWS\assembly\GAC_32
55143608 / .\WINDOWS\assembly\GAC_MSIL
105975390 / .\WINDOWS\Driver Cache
96353450 / .\WINDOWS\Installer\$PatchCache$
1898024 / .\WINDOWS\SchCache
1174871 / .\WINDOWS\pchealth\helpctr\OfflineCache
451465998 / .\WINDOWS\system32\dllcache
(I also included the GAC directories: .NET Global Assembly Caches.) Summing these sizes, I see that 3 Gb or so of space is occupied by self-declared caches. For many of these I don't know whether it is safe to delete them. Luckily the largest was a game I had installed for Max and could completely uninstall.
Windows provides the Disk Cleanup utility, which knows how to get rid of a bunch of stuff you don't really need. Application developers can even write a handler to clean up their own unneeded files, but it seems application developers don't, as I don't have any custom handlers on my machine.
CCleaner is a Windows utility to scrub a little harder at your disk, but even it missed some of these folders: for example, it removed the smaller of the CameraRaw caches (8 Mb), but left the larger (244 Mb). I read online that CameraRaw really doesn't need those files, so I removed them by hand.
I'm all for applications making use of disk space to improve the user experience, but they should do it responsibly: give me a way to see what's being used, and give me a way to get it back. And only keep what makes sense: why do my Apple Installer Cache directories have kits for three different versions each of iTunes, QuickTime, and Safari, and seven kits for Apple Mobile Device Support? Why do I need to keep installers for versions that have already been superseded?
September 04, 2008
IronPython-URLs
IronPython at PyWorks and PyCon UK Conferences
The PyWorks 2008 conference schedule is now up. This is a new conference, held in Atlanta November 12th-14th by the same people who bring us the Python Magazine. I'll be speaking at PyWorks on IronPython: Python on .NET and in your Browser, covering:
- Windows Forms
- Databases
- Web services and network access
- Handling XML
- Threading
Arthur Koziel
Automatic superuser creation with Django
I delete and re-sync my database fairly often during development with Django, because the "syncdb" command will not alter a table in the database after, for example, adding a new field to the corresponding model.
The problem I have with this is typing in the same data for a superuser over and over again. It's a very repetitive task, so I was grateful when I heard this tip from my co-worker Sebastian today.
Superuser from fixture
We're going to automatically load the superuser from a fixture. To do this, dump the data of the auth module into a fixture called "initial_data.json":
./manage.py dumpdata --indent=2 auth > initial_data.json
You'll see that alongside the superuser you've already created during the usual "syncdb" execution, a few other records got dumped. Since we only need the data for the superuser, delete the irrelevant stuff. The file should look like this:
[
  {
    "pk": 1,
    "model": "auth.user",
    "fields": {
      "username": "arthur",
      "first_name": "",
      "last_name": "",
      "is_active": true,
      "is_superuser": true,
      "is_staff": true,
      "last_login": "2008-09-04 14:25:29",
      "groups": [],
      "user_permissions": [],
      "password": "sha1$fooobar123",
      "email": "arthur@arthurkoziel.com",
      "date_joined": "2008-09-04 14:25:29"
    }
  }
]
The fixture called "initial_data.json" will automatically get loaded by Django every time you execute the "syncdb" command.
Delete your database and try to run the "syncdb" command with the "--noinput" option passed (it will prevent the script from going into interactive mode):
./manage.py syncdb --noinput
There shouldn't be a prompt for a superuser and you should see a message at the end of the output indicating that your fixture was loaded.
Admin login
Not having to create a superuser is great, but if you're working a lot with Django's contrib.admin application, you'll still need to log in again every time you sync the database and load the user fixture. That's another repetitive task that can be eliminated:
After logging in to the admin backend, dump the data of the "sessions" app to stdout:
./manage.py dumpdata --indent=2 sessions
Copy the dictionary containing the session for your superuser and append it to the list in "initial_data.json" like this:
[
  {
    "pk": 1,
    "model": "auth.user",
    "fields": {
      "username": "arthur",
      "first_name": "",
      "last_name": "",
      "is_active": true,
      "is_superuser": true,
      "is_staff": true,
      "last_login": "2008-09-04 14:25:29",
      "groups": [],
      "user_permissions": [],
      "password": "sha1$foobarbarfoo",
      "email": "arthur@arthurkoziel.com",
      "date_joined": "2008-09-04 14:25:29"
    }
  },
  {
    "pk": "9aadfe1de61b0937fasd684221f03",
    "model": "sessions.session",
    "fields": {
      "expire_date": "2008-10-20 14:34:59",
      "session_data": "foobar123"
    }
  }
]
You might want to increase the "expire_date" a little bit, so that your session won't expire.
Now every time you delete and sync your database (remember to pass "--noinput"), Django will automatically load the superuser and its associated session from the fixture. You won't have to manually type in the data for the user and log in to the backend every time anymore.
Tarek Ziade
Yet another Planet
Atomisator is a framework, so it is hard to get an idea of its features until a real application uses it.
That is why I wrote a small application in Pylons called Yap (Yet Another Planet), which basically displays the XML file produced by an Atomisator instance. Since Atomisator does all the work, the Pylons app is really small (one or two controllers, that’s it).
My first use case was to produce a nice, smart Planet for our user group Afpy.
Here’s a first draft: https://ziade.org/afpy/
You can play with ‘j’, ‘k’ and the arrow keys to open and close posts. I am still working on this; eventually it will also scroll the window when you are on a post.
Anyway, it grabs various French sources for Python and uses these plugins from Atomisator:
- filter : reddit
- filter : delicious
- filter : doublons
- enhancer : related
- enhancer : digg
The result basically follows the reddit and delicious links to display an extract of the page linked, and displays digg comments as well. Duplicates are removed, and a list of related entries is added to each entry.
It is based on this configuration file, which Atomisator uses in a cron job to generate an XML file for Yap:
[atomisator]
sources =
    rss https://del.icio.us/rss/tag/python+fr Delicious
    rss https://www.afpy.org/search_rss?portal_type=AFPYNews&sort_on=Date&sort_order=reverse&review_state=published Afpy News
    rss https://feeds.feedburner.com/Baderlog/python Bader
    rss https://www.biologeek.com/journal/rss.php?cat=Python Biologeek
    rss https://www.gawel.org/weblog/rss/python/afpy/zope/zope3/rss.xml Gawel
    rss https://www.haypocalc.com/blog/rss.php?cat=Python Haypo
    rss https://jehaisleprintemps.net/blog/rss/ No
    rss https://programmation-python.org/sections/blog/exportrss Tarek
    rss https://api.blogmarks.net/rss/tag/python,fr Blogmarks

# put here the database location
database = sqlite:///afpy.db

# this is the file that will be generated
file = /home/tarek/www/packages/Yap/trunk/yap/public/afpy.xml

# infos that will appear in the generated feed.
title = Planet Python Francophone
description = Le planet de l'Association Python Francophone, et des gens heureux.
link = https://www.afpy.org/planet/

filters =
    reddit
    delicious
    doublons

enhancers =
    related
    digg
What’s Next?
Until now, there has been no attempt to automatically classify entries. The next plugin I am working on will provide a naive Bayesian filter to classify entries, together with a way to train it through the Yap web interface: basically a ‘keep’/‘ditch’ button.
I will also set up an English Planet Python to see how things go with more sources.
Greg Wilson
Science 2.0: the Future of Online Tools for Scientists
A pub night and panel with Timo Hannay, Cameron Neylon, and Michael Nielsen, hosted by Nature Network Toronto
What does the future hold for the way we do science? Are online repositories such as GenBank and the physics preprint ArXiv, or social tools such as Nature Network, about to change science profoundly? To find out, join Nature Network Toronto for an interactive panel discussion over drinks at the pub.
Date: Sunday September 7 at 7:30pm
Place: Fionn MacCool’s (181 University Avenue, near corner with Adelaide)
About the panelists:
Timo Hannay is Publishing Director of Nature.com at the Nature Publishing Group, publishers of Nature and over seventy other scientific journals, plus numerous online resources for scientists. He is responsible for new online initiatives in social software, databases and audio-visual content. Timo trained as a neurophysiologist at the University of Oxford and worked as a journalist and a management consultant before becoming a publisher.
Cameron Neylon is a biophysicist working in molecular biology, biophysics, and high throughput methods. He has a joint appointment as a Lecturer in Combinatorial Chemistry at the University of Southampton and as a Senior Scientist in Biomolecular Sciences at the ISIS Pulsed Neutron and Muon Facility. He is developing an electronic notebook for biochemistry labs, which has led to his involvement in the Open Research movement and to his group moving to an Open Notebook.
Michael Nielsen is a writer living just outside Toronto, Canada. He is currently working on a book about The Future of Science. One of the pioneers of quantum computation, he coauthored the standard text on quantum computation that is the most highly cited physics publication of the last 25 years. He is the author of more than fifty scientific papers, including invited contributions to Nature and Scientific American.
For more information visit Nature Network Toronto (https://network.nature.com/group/toronto), or contact Eva Amsen (eva.amsen@gmail.com) or Jen Dodd (jen@jendodd.com, 519 572 2275).
Benjamin Peterson
Fun with 2to3
Recently, I've been working on the 2to3 code refactoring tool. It's quite exciting really; how often does automatic editing of source code work?
2to3 is completely written in Python using the stdlib. The main steps in code translation are:
- 2to3 is given a Python source file and a list of transformations (in units called fixers) to apply to it.
- 2to3 generates a custom parse tree of the source based on a Python grammar that combines elements of 2.x and 3.x syntax. It takes note of exact indentation and comments so they can be reproduced exactly later.
- Each fixer has a pattern that describes the nodes it wants to match in the parse tree. The tree is traversed while asking each fixer if it matches the given node.
- If the fixer's pattern matches the node, the fixer is asked to transform the code. The fixer can manipulate the parse tree directly.
- A diff against the original source is printed to stdout and optionally written back to the file.
Over the past few weeks, I've written a couple of fixers. It's pretty intuitive once you get the hang of it, but writing good tests is very important because Python's flexible syntax produces many possibilities your fixer must deal with. I also refactored lib2to3, so plugging in a different system of fixers is much easier for custom applications. I've also written some documentation on its usage. I hope to start documenting the API and writing a guide for creating fixers soon, so other people can start making use of lib2to3.
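For a flavor of the library side, here is a minimal sketch of driving lib2to3 programmatically. It assumes the RefactoringTool API roughly as it stands in the current betas; the API is still in flux, so treat this as illustration, not gospel:

from lib2to3.refactor import RefactoringTool

# run a single fixer over a string of 2.x source
fixers = ["lib2to3.fixes.fix_print"]
tool = RefactoringTool(fixers)
tree = tool.refactor_string(u"print 'hello'\n", "<example>")
print str(tree)  # -> print('hello')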
Jesse Noller
Benjamin Peterson: Testing the CPython Core
See this "Google Open Source Blog"-post about "Testing the CPython Core".
If you don't know, Ben has been helping us deliver the Python 2.6 and 3.0 betas (and RCs) all summer long. He's personally helped me with a lot of stuff around the multiprocessing package and the like, and he has really contributed to the releases directly.
So, thanks Ben!
The Python Papers
Volume 3 issue 2 is released
Hi everyone
After a long wait of nearly 5 months, we are back in business to bring you the latest edition of The Python Papers, Volume 3 Issue 2 ( https://ojs.pythonpapers.org/index.php/tpp/issue/current ).
From this issue onwards, we will be having only 3 issues per year instead of 4. This is in compliance with our ISSN registration.
What's new
=========
1. We have expanded our editorial team with 2 new Associate Editors, Sarah Mount (from the UK) and Guy Kloss (from New Zealand).
2. TPP is now managed using Open Journal System and can be accessed at https://ojs.pythonpapers.org/tpp
3. Backporting of previous issues of TPP from Volume 1 Issue 1 is complete
4. We had "soft-launched" TWO new periodicals - The Python Papers Monographs (for monograph-length submissions which may include dissertations, conference proceedings, case studies and advanced-level lectures) and The Python Papers Source Codes (modeled after ACM Collected Algorithms and provides a collection of software and source codes, usually associated with papers published in The Python Papers and The Python Papers Monograph). They shall be TPPM and TPPSC respectively.
5. Collectively, TPP, TPPM and TPPSC will come under the umbrella of The Python Papers Anthology (TPPA) and be managed by the same editorial committee.
6. Probably the most important development for TPP is that it is now indexed by a number of services, including Google Scholar and OAIster, as a result of using Open Journal System.
So, please enjoy our latest edition. We look forward to your continued support and contributions.
Thank you.
Cheers
Maurice Ling
Co-Editor-in-Chief, The Python Papers Anthology
Small Values of Cool
Django 1.0
Congratulations to everyone involved in getting Django 1.0 out of the door. Great job.
Apress eBooks
Buy a book from Apress (such as the superb Definitive Guide to Django) and you can get the PDF version for a fiver, so you can keep a copy on your laptop when you are on the road.
What a great idea. I don't want only the PDF - I much prefer to read the dead tree edition. But you can't carry dozens of chunky tech books around with you, so I want the PDF too. Clearly I shouldn't have to pay full whack twice, but the fiver that Apress is asking seems fair.
I wish O'Reilly did this.
Jesse Noller
Stirred up dem bees: Should BSDDB be removed from Python?
This week, we've seen a push dev-wise to get RC1 completed and ready to go - I've spent some time giving multiprocessing some love (still not done) and a lot of other people have been working around the clock to close out the large number of release blockers.
As of last night, though, the trigger was pulled on removing bsddb (the Berkeley DB Python module) from the standard library in the 3.0 timeline (2.6 adds deprecation warnings).
Now, before anyone thinks this is an arbitrary decision, here's the argument (in a nutshell):
- bsddb has always been painful to maintain
- Jesus Cea is the only person who has stepped up to maintain it
- bsddb is "heavy weight" - out most of the standard library, it has the most dependencies and nuances to cross platform maintenance.
- Until Jesus Cea stepped up late in the 2.6/3.0 process, it was "one of those packages" that no one wanted to maintain.
- For most of 2.6 and 3.0 it's been a buildbot fail train.
See PEP 3108:
Maintenance Burden
Over the years, certain modules have become a heavy burden upon python-dev to maintain. In situations like this, it is better for the module to be given to the community to maintain to free python-dev to focus more on language support and other modules in the standard library that do not take up an undue amount of time and effort.
bsddb3
- Externally maintained at https://www.jcea.es/programacion/pybsddb.htm .
- Consistent testing instability.
- Berkeley DB follows a different release schedule than Python, leading to the bindings not necessarily being in sync with what is available.
This thread is where the hammer fell.
Now, note that Jesus Cea has done an amazing amount of work updating/upgrading the bsddb support for 2.6 and 3.0 (see his recent announcement here). I feel for him in a lot of respects: He busted his butt to fix, maintain and resolve all open issues with bsddb and the buildbots for the release, but the decision had been made back in July to remove/deprecate the bsddb package (see above).
Now, there is a lot more discussion occurring around the removal.
Edit: I finally got a free moment to do an update - in an email this afternoon on Python 3000, the BDFL (GvR) made the final decision on bsddb - it's out as of py3k:
I am still in favor of removing bsddb from Python 3.0. It depends on a 3rd party library of enormous complexity whose stability cannot always be taken for granted. Arguments about code ownership, release cycles, bugbot stability and more all point towards keeping it separate. I consider it no different in nature than 3rd party UI packages (e.g. wxPython or PyQt) or relational database bindings (e.g. the MySQL or PostgreSQL bindings): very useful to a certain class of users, but outside the scope of the core distribution.
Python 3.0 is a perfect opportunity to say goodbye to bsddb as a standard library component. For apps that depend on it, it is just a download away -- deprecating in 3.0 and removal in 3.1 would actually send the *wrong* message, since it is very much alive! I am grateful for Jesus to have taken over maintenance, and hope that the package blossoms in its newfound freedom.
- Subscriptions
- [OPML feed]
- Aahz's Weblog
- Abe Fettig
- Adam Pletcher
- Aftermarket Pipes
- Alessandro Iob
- Andre Roberge
- Andrew Bennetts
- Andrew Dalke
- Andy Dustman
- Andy Todd
- Anthony Baxter
- Arthur Koziel
- Atul Varma -- Toolness
- Baiju M.
- Base-Art / Articles
- Beginning Python for Bioinformatics
- Ben Bangert
- Benjamin Peterson
- Benji York
- Brandon Rhodes
- Brett Cannon
- Brian Jones
- Calvin Spealman
- Carlos de la Guardia
- Chad Whitacre
- Chris McAvoy
- Chris McDonough
- Christopher Lenz
- Chui Tey
- Corey Goldberg
- Cosmic Seriosity Balance
- Daniel Nouri
- David Ascher
- David Goodger
- David Stanek
- Davy Wybiral
- Deadly Bloody Serious about Python
- Dethe Elza
- Doug Hellmann
- Doug Napoleone
- Dougal Matthews
- Duncan McGreggor
- Enthought
- EuroPython Conference
- Feet up! : dev/python
- Flavio Coelho
- Floris Bruynooghe
- Frank Wierzbicki
- Fredrik Lundh
- Georg Brandl
- Glenn Franxman
- Glyph Lefkowitz
- Graham Dumpleton
- Greg Wilson
- Grig Gheorghiu
- Guido van Rossum's Weblog
- Gustavo Niemeyer
- Guyon Moree
- Hans Nowak
- Heikki Toivonen
- IPython0 blog
- Ian Bicking
- IronPython-URLs
- James Tauber
- Jarrod Millman
- Jeff Rush
- Jeff Shell
- Jehiah Czebotar
- Jeremy Hylton
- Jesse Noller
- Jim Baker
- Jkx@home
- Johan Dahlin
- JotSite.com
- Julien Anguenot
- Juri Pakaste
- Kevin Dangoor
- Keyphrene Dot Com
- Krys Wilken
- Kumar McMillan
- Laurent Szyster
- Lennart Regebro
- Level++
- Life of Brian Ray - Python
- Low Kian Seong
- Mar
- Marius Gedminas
- Mark Dufour
- Mark Mruss
- Mark Nottingham
- Mathieu Fenniak's Weblog
- Matt Goodall
- Matt Harrison
- Matt Kaufman
- Matthew Wilson
- Max Ischenko' blog
- Max Khesin
- Michael Bayer
- Michael Hudson
- Michael J.T. O'Kelly
- Mike C. Fletcher
- Mike Pirnat
- Muharem Hrnjadovic
- Neal Norwitz
- Ned Batchelder's blog
- Nick Efford
- Nicolas Evrard
- Noah Gift
- Orestis Markou
- Patrick Roberts's Blog
- Patrick Stinson
- Paul Everitt
- Paul Harrison
- Peter Bengtsson
- Phil Hassey
- Philip Jenvey
- Philipp von Weitershausen
- Phillip J. Eby
- PyAMF Blog
- PyCon
- PyCon 2007 Podcast
- PyCon 2008 Podcast
- PyCon 2008 on YouTube
- PyCon UK News
- PyPy Development
- PyWorks Conference
- Python 411 Podcast
- Python Advocacy
- Python African Tour
- Python Magazine
- Python News
- Python Postings
- Python Secret Weblog
- Python Software Foundation
- Python User Groups
- Pythonology
- Rene Dudfield
- Richard Jones' Log: Python
- Robert Brewer
- Roberto De Almeida
- Robin Dunn
- Ryan Cox
- Ryan Phillips
- SPE Weblog
- Sean McGrath
- Second p0st
- ShowMeDo
- Shriphani Palakodety
- Simon Belak
- Simon Wittber
- Small Values of Cool
- SnapLogic
- Speno's Pythonic Avocado
- Spyced
- Steve Holden
- Supervisor
- Swaroop C H, The Dreamer » Python
- Tarek Ziade
- Ted Leung
- Tennessee Leeuwenburg
- Tero Kuusela
- The Law Of Unintended Consequences
- The Python Papers
- The Voidspace Techie Blog
- Thomas Vander Stichele
- Tim Golden
- Tim Parkin
- Titus Brown
- Troy Melhase
- V.S. Babu
- VirtualVitriol
- Will Kahn-Greene
- Will McGugan
- it's getting better
- ivan krstić · code culture
- planet.python.org updates
- To request addition or removal:
e-mail webmaster at python.org