Europython wednesday lightning talks
Short 5 minute talks.

Felix Wiemann - examples of security issues
restructuredtext had a couple of nice features, like including a file. The assumption was that users of the library would do something about that in their app. No. They didn't. They assumed that restructuredtext was neat and clean and secure. Conclusion: think about security issues and mention them. Also, talk to the users of your library. Also, talk to the provider of your third-party libraries if you use them.

Richard - python games

Aaron Bingham - Fixing "design by contract"
See his earlier talk.

Andrew Maier - Ganga
Ganga helps with sending jobs to the computing grid that CERN uses for their completely ridiculously large amounts of data. The scientist can use his local machine for debugging or short tests, he can use a local batch system for small jobs and he can use the grid for the big jobs. Every one of them needs a different approach. Ganga helps them to submit their jobs in a uniform way to all three systems. It has a text interface and they even use a python syntax for submitting the jobs.

Stefano Masini - Shell CS
Shellcs is a framework for making shells. Shells in this case are not something like bash or csh, but shells like those used by, for instance, database managers. You can call "manager import xxx" on the commandline, but you can also just call "manager", after which you'll get an interactive prompt where "import" is one of the allowed commands. The project will go live in a few days or so on sourceforge.

About the conference
Europython was in contact with the CERN guys and gals before europython 2005. Locations need to be booked well beforehand. There were no webserver fights as they just used the standard CERN conference system. Which worked really well. A possibility for next year is Vilnius, an alternative is Brussels, but that one is more likely for 2008.

Sylvain Thénault - static type inference in logilab-astng
astng is a monkeypatch to the standard library's It is used for pylint (static checking of python code) and for documentation generation. The aim is to be good enough to point out possible bugs. It is not intended to be fully correct or to be used for code generation. The inference is done on a local ast (abstract syntax tree) and not on the bytecode, so you don't need to actually perform an import.

Jean-Philippe Rey - teaching python at École centrale de Paris
ECP is a general engineering school with some 500 students per year. Just a small percentage will end up programming for a living. Some of the things you need to teach them: How to talk to and how to understand programmers. To understand how a computer works. They switched to python for their teaching: interpreted, so fast to try out. A simple syntax. Free and easy to install. Also usable by the more advanced students. Python has been in use for 2 years now and it raised the students' interest in the course.

Ignas Mikalajunas - Access control with Crowds
A crowd is a set of principals (a principal is a logged-in user). They use it for schooltool. A crowd is an adapter that gives back a crowd, that is, a list of users that have a certain permission. Afterwards, you can test if a user is a member of a crowd.

Felix Wiemann - restructured text presentations
With rst2s5 you can generate S5 presentations from restructured text files.

Steve Alexander - Launchpad
He thinks that launchpad is a good way of consolidating your project. It can help you manage translations by means of rosetta. It also has a bugtracker system. Support request management. Management of requested/proposed features. Etc.

Europython: distributed companies and agility+customers
Managing the launchpad team at Canonical, managing a distributed company, agile customer management. The last europython 2006 talks on the last day.

Steve Alexander - Managing the launchpad team at Canonical
People at canonical normally work from their homes. Most communication is via phone or IRC. In total, Canonical has employees in about 20 countries. With a total timezone difference of 16 hours (launchpad), a meeting when everybody is awake is hard. Communication uses phone, irc, screen/vnc, wiki, launchpad, specifications, code review, design review, jabber, email, bug triage, activity reports, weblogs.

The ubuntu team has a weekly meeting with a very strict format: they just go through everybody's progress and people get a chance to say where they're blocked (waiting for someone else to complete something). This gives the team the chance to de-block them and get them productive again.

The Launchpad weekly meetings are strictly 45 minutes. They always start on time and end on time. This is the only way to get buy-in from people who have to get up real early or have to stay up late (remember, 16 hour timezone difference!). And they have a couple of standard agenda items. At the end of the meeting people are allowed to say what to keep/bag/change. Keep short meetings, for instance. Change the standard agenda. Whatever.

Items must be dealt with swiftly and decisively. You either fix it now or assign it to the responsible people and to a deliverable.

One of the ways to keep the IRC meeting moving is to prepare text beforehand. If you know you want to say something, make sure you can paste the sentences in one go. If you ask something ("next meeting on monday?") do a countdown 7 6 5... to force people to say it if they disagree. If you do a poll ("who upgraded to the latest version of xxxx?"), everybody has to answer done/not done. So he explicitly says "everybody answer done/not done" in irc.

Every day they send a timelog to a mailinglist.
Most use gtimelog for that.

Q: How does it compare to an on-location team?
Steve: I'd choose the on-location team all the time. It's so much easier.
Q: Why did you start it as a distributed company, then?
Steve: Otherwise we couldn't get the people. You just can't get them all to move to one location. They hired people that worked on debian in their free time and offered to pay them to work on it: you couldn't get them to move.

Lene Wagner - Managing a distributed company
Lene manages merlinux together with Holger Krekel, with 8 employees spread out in Germany and Europe.

There is a difference between a company context and an open source context. The common aspect is that a bunch of skilled people work on software. The difference is that there are more commitments: money, customers, etc.

Distributed versus non-distributed company. Common: commercial activity, customers, commitments and formal requirements. Distinctions are that there's no face-to-face communication and that the working hours are less fixed.

They asked their employees to fix a major part of their availability during the week so that they can rely on getting an amount of work done. They also have regular sync meetings on IRC (nicked from canonical). They do it in 30 minutes. They have invitations and formal minutes. Identifying blockers and dependencies is an important part. So are updates on product status and feedback.

They're experimenting with one-week merlinux sprints; they have done two so far. Of course they also participate in project-specific sprints and conferences. At the one-week sprints they try to concentrate the longer discussions about company practices and goals. Giving feedback and having fun are also part of the sprints.

Central to their company is the issue tracker. Issues are assigned to specific persons. It is for both development and non-development issues.
The person responsible is really responsible: for keeping track, for a result, for triggering/involving other persons and for reporting blockers and failures.

Tracking and controlling is done by means of an svn repository. Both code and company data. There are email notifications of each and every change to stay on top of all the developments. They make sure that everybody is subscribed to the commit lists that are relevant to him/her.

Everybody has to watch his use of communication channels. Do I say this in IRC or in an email? If I send an email, am I not spamming too many people? Email is written documentation, which gives it weight. So if you say on IRC that you'll take over something: also send an email.

The formality level of the company changes per project. Some EU projects need fixed employees, other projects allow freelancers. EU projects also require more paperwork and thus more formality on the company's side. So they're agile with the formality level.

Social: there is less socialising, so if someone is stuck or not motivated, he has to explicitly request help/feedback. There's also no discipline from a formal worktime/office structure, but this also means that people can adapt their workstyle to personal situations (bringing kids to school, for instance).

Pros of a distributed company:
Cons of a distributed company:

Aiste Kesminaite - Agile customer management
Tools and processes on the customer's side will change, but people will mostly stay the same. If the people change, for instance because a central figure at your customer gets a new job, beware.

When talking with a customer, try to find a way to get the rationale from the customer, the real reason why they want something. One way is to give them an activity during which you can get them to talk about what they want. For instance writing stories (from eXtreme Programming). Some customers are scared of that because they think that all their normal procedures are going to fall down around their ears. Basic rule: be there and listen to them talk.

Documentation: customers don't read it, it just makes them feel better because they have something visible on their bookshelves.

It is important to become a team: you and the customer. Get your customer to trust you. It must be you and the customer versus the problem, not you versus the customer.

Customers tell you to follow the plan, but what they really want is for you to follow the plan that is in their heads, not the one that's on paper. So you have to be flexible and respond to change.

Issue tracking is a sensitive issue. If it works it is great. They had a problem with a customer that just couldn't put the issues in a coherent form into the issue tracker. They lump together multiple issues instead of splitting them out. In those cases they just collect all the user emails and turn them into issues themselves.

Agile methodologies are great when they work. Don't try to apply everything all at once. Start small and teach the client along the way.

Martijn Pieters: Linktally, ranking popular pages in a CMS
Europython 2006 talk about a handy tool that gives you "most popular" statistics for your CMS.

Updating a counter in the CMS on every page access isn't generally a good idea. CMS datasources are normally optimised for read access, so writing on every read HURTS. It doesn't scale. And it doesn't scale to clustering.

The idea is to reuse the server logs (apache). Linktally is simple, scalable, content oriented and CMS-independent. It doesn't deliver exact counts, but it does provide a ranking: which page is the most popular.

The architecture is that the logs are shoved through Linktally, which stores the results in a simple database. Linktally grabs its configuration from a plugin in the CMS; that same plugin also grabs the results from the database. What linktally grabs from the CMS is actually a list of links to look for in the logs.

The CMS can query linktally for the x most popular pages. Or the x most popular news items. Or...

Linktally 0.1 has just been released. This version only supports apache combined logfiles. And there's only a zope/cmf plugin at the moment. The plugin is real easy to write, so he hopes more people will add plugins.

Europython: Jeroen Vloothuis - Tramline, big files are fun
Europython 2006 talk about hosting big files from outside a CMS.

Plone doesn't really like big files. Common solutions: put the files on the disk, either by serving them from apache directly or using some filesystem-folder product. Neither is ideal.

Tramline is put in between apache and plone. Tramline strips out all the file data and extracts that to disk, so plone only gets some small file. When the file is requested, tramline adds it again.

Some advantages: transparent integration, you can develop and test your application without tramline, the only difference is a few extra headers. The performance is high, apache does most of the work. It is secure, as you still have to go through the entire zope/plone security infrastructure: you can't fake the URL and get the file directly from apache. The files are on disk, which makes many people happy.

You have to send two extra http headers to integrate with tramline. Tramline needs a bit of apache configuration, as it integrates through mod_python.

Some benchmarks: zope managed 26 MB/sec, tramline managed 92 MB/sec, just on his local machine.

Tramline is easy to set up, it integrates nicely and it scales like apache.

In response to a question: it cannot handle, for instance, archetype's image scaling yet. That needs the file in the ZODB.

Europython: Philipp von Weitershausen - The june releases
The upcoming zope releases.

In the beginning of the zope2 release cycle there were regular releases with two or three important changes each. 2.7 and 2.8 both had a lot more important changes and also took a long time. 2.7 took 1.5 years and 2.8 took 1 year. Five was especially important to drive 2.8 onwards, zope2 was a bit stalled at that time.

Since december 2005 there are time-based releases: a new one every 6 months. So you're guaranteed that stable new features will be in a release within half a year. New major releases don't necessarily retire the old ones. The system seems to be working.

Plone 2.5, CPS 3.4 and Silva use big parts of zope3 already.

Probably underestimated is the huge amount of possibilities allowed by WSGI. There are people trying XSLT pipelining on top of basic zope stuff, for instance.

Europython: Pyphant, semantic web and JSON-RPC
A workflow application for analysing scientific images, semantic web in python and a funny RPC api that uses JSON. Wednesday morning europython 2006 talks.

Klaus Zimmerman - Pyphant
It is a framework designed for the easy modelling of reusable data processing and analysis workflows.

One of the items of a workflow ("recipe") is a "worker", which takes input from a "socket", accepts parameters and gives back output on a "plug". In the GUI system you can then connect a number of workers with their in/output sockets. They used this for instance for pushing an image (from a microscope) through a number of filters. There are workers with multiple input sockets.

If you select a worker, you get an input box where you can tweak the parameters. The gui has a list of all the available workers.

Plugs are the actual computation entities. They receive their input from the sockets of their worker and they cache their results in a cache-safe way. Recipes aren't really executed, but evaluated lazily. If you have a branched recipe it automatically gets executed in threads.

Pyphant allows researchers to do quite a lot of tasks themselves. They plan to add more workers and also to allow loops, which then needs more controls, etc. Looks good.

Marian Babik - Deep integration of python with semantic web technologies
The semantic web is an extension of the current web, providing infrastructure for the integration of data on the web. The idea is to make the actual data available: not just binary files, but the actual data.

RDF is the basis for the semantic web. Source, target and a link between them. All three have a URI (the target can be a simple string, though). As everything is identified by a URI (think URL), you can merge data from different sources. If I'm identified by
RDFS allows you to define classes and subclasses. OWL builds on RDFS. RDFS can't handle everything elegantly, subclassing only gets you so far. OWL also handles equality of classes, enumerations, datatypes, etc.

An important difference between an object oriented worldview and a semantic web worldview: OO has a closed world assumption, the semantic web has an open world assumption. OO assumes that it has all the info, semweb assumes that there is more information somewhere else, so it is more conservative in concluding something.

Python has a lot of RDF programs. RDFLib, CMW, Pychinko, MetaLog, Sparta, Tramp.

His work on Seth tries to map between python and the semantic web. OWL classes become python classes, for instance. Ontologies are turned into modules. Looks pretty nice.

Jan-Klaas Kollhof - JSON-RPC
Why another RPC? He didn't like SOAP, xml-rpc missed some things, etc: he wanted to write his own version :-) Some of the goals were: easy to understand, easy to build, buildable in javascript (so: text-based, not binary).

At a high level in JSON-RPC, you have two peers who are allowed to bug each other at any time, either with results or requests. Asynchronous. If something takes long on the server it just takes a lot of time to send back the response. If you send in an easy request in the meantime, its response might just be returned earlier than that of the time-intensive request.

Basically, you encode dictionaries in JSON. A request is the name of the method, some parameters and an ID. A response is the actual result plus the ID, so that you know which request it is the response to. There's a third one: a notification for just telling the other peer something without expecting a response.

Question: why positional arguments instead of named arguments? Answer: not all languages support named arguments, but he'll probably switch to named arguments altogether as they make it all much more readable.

There was some discussion with the audience.
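
The three message kinds described above can be sketched in a few lines of Python. This is only an illustration, not the speaker's code: the field names ("method", "params", "id", "result", "error") and the use of a null id for notifications follow the JSON-RPC 1.0 specification as far as I know it, and all helper names and the example method names are made up for this sketch. It also shows how the ID lets a peer match a response that arrives out of order.

```python
import json
from itertools import count

_ids = count(1)  # simple source of unique request ids (an assumption)


def make_request(method, params):
    """Encode a call that expects a response."""
    return json.dumps({"method": method, "params": list(params), "id": next(_ids)})


def make_notification(method, params):
    """Encode a one-way message; a null id means no response is expected."""
    return json.dumps({"method": method, "params": list(params), "id": None})


def make_response(request_text, result):
    """Encode the reply to a request, echoing the request's id."""
    request = json.loads(request_text)
    return json.dumps({"result": result, "error": None, "id": request["id"]})


def match_responses(requests, responses):
    """Pair each response with its request by id, whatever the arrival order."""
    pending = {}
    for text in requests:
        request = json.loads(text)
        pending[request["id"]] = request
    paired = {}
    for text in responses:
        response = json.loads(text)
        request = pending.pop(response["id"])
        paired[request["method"]] = response["result"]
    return paired


# Hypothetical method names: one slow call and one quick call.
slow = make_request("render_report", [2006])
fast = make_request("echo", ["hello"])

# The quick call finishes first, so its response arrives first.
arrived = [make_response(fast, "hello"), make_response(slow, "report ready")]
results = match_responses([slow, fast], arrived)
```

Because every response carries the ID of its request, the order of arrival does not matter; a notification carries no usable ID, which is why it cannot be answered or report an error.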
Jan-Klaas dislikes doing everything with plain http-POST. That will go through all firewalls, but it is harder to send messages in both directions: the client will have to poll the server as the server can't really send anything itself. The alternative (that he likes himself) is to use socket connections, which give a reliable data stream and allow the server to send stuff to the client. But it'll probably die on many firewalls or proxies and you can't have too many persistent connections.

Some flaws. Notifications have no error handling, but they're basically a silly idea anyway: just drop them and use normal requests. The message itself isn't JSON. There's no introspection, but that's not such a big issue as you should write applications for known interfaces, you can't really write an application on the fly at runtime :-)

Some future options: named parameters, remove notifications, some standard errors, HTTP binding in the spec (POST+GET), make response/request ID optional, etc.

Europython keynote: Guido van Rossum - python 3000
Guido explains the changes and the goodies that will be coming in python 3000.

One clarification by Guido up front: lambda is not going away!

Guido expects a lot of tension between him and the rest of the developers on python 3000. There are a lot of things people want to change or add. Everyone has his own pet peeve that he wants to add or change and python 3000 brings that out.

Python 3000 is the only release in which there will be incompatibilities and breakage. They won't invite breakage, but it will be allowed. Some things can't be fixed without incompatibilities and this is the time to change those. Fix early design bugs. Old style classes will be removed. Old style integer division. Stuff that has been deprecated for quite some time.

python 3000 == python 3.0 == py3k

There has to be some amount of process, otherwise we're lost. There are too many proposals competing for time: this won't be perl 6. When do we want py3k to be ready? How long are we going to maintain 2.x and 3.x? How do we migrate? Do we backport py3k features? We can't freeze 2.x and let it die without support.

First alpha: not before 2007. Final release probably a year after that. Probably there'll be a quick 3.1 and 3.2 afterwards.

2.x and 3.x will be developed in parallel. 2.6 will come, probably before 3.0. 2.7 is likely, may contain some backports from py3k.

How incompatible can it be? New keywords are allowed. Some examples of things that won't happen is that it'll stay

A completely mechanical migration of existing code won't be doable. Some things just cannot be recognised. The most likely approach will be to use a pychecker-like tool to detect some 80% of the changes and to create a version of 2.x that warns about doomed code.

Some things that won't be in py3k. No programmable syntax/macros. No syntax for parallel iteration (use zip). Iterating over a dictionary gives you keys instead of key/value pairs, that will stay that way.

PEP 3100 has a large list of items that are being considered for inclusion in py3k.

There will be a new standard IO stack. C stdio has too many problems. You don't know how many bytes you have buffered, for instance. The new bytes/str implementation gives an opportunity to fix all this.

Print becomes a function instead of a statement. Print-as-statement is a barrier to evolution of your program and of the language (see link in PEP3100).

Europython: lightning talks
Lots and lots of short talks.

MosheZ: Componentizing a stairway to heaven
Use components a lot, they are really really cool. (Sounds like zope3 components). One of the big tricks is to always access objects through an interface. Raviolli pattern, Risotto pattern, Lasagna pattern (layered). Spaghetti pattern. I don't know what he meant by it, but it sure sounds funny :-)

Michael Hudson - pydoctor
Documentation generation tool. "API docs are basically impossible for mortals to generate", so he created something new. He didn't look at prior art and thought it all over fresh from the start. Extracting docstrings is the easiest part. It knows about zope.interface. It can cope moderately with

Reinout van Rees - instancemanager
Instancemanager helps you to manage your zope/plone instances. At Zest software it replaced custom made scripts that had to be adapted for every project. Some things it can do: create a zope instance, grab products and fill the Products directory, copy a pre-made Data.fs if available, start/stop zope, quickreinstall all products, etc. You can also do several things with a "combining" call. Products can be extracted from .tgz, .zip or can be symlinked from svn checkouts. Bundles can also be handled. Future: two people use it on the server, so it needs a look at the safety aspect: wiping your production server isn't a good idea and instancemanager will currently happily oblige you if you ask it to do just that. Also: Jim Fulton talked about buildout today, which does something similar.

Arigo? - pypy thunk and sharedref
Thunk is a way to do lazy evaluation. Sharedref allows python sessions to share variables: they're no copies, they're real variables. You can append something to a list in one running python and you get it updated in the other. Just a demo prototype, but much fun.

Philipp von Weitershausen - zope.testbrowser
Sometimes you just want to treat your web app as a black box, just sending http stuff at it from the outside, behaving as a real user. Selenium is too slow and it can't be automated (with for instance buildbot). Zope.testbrowser is a programmable browser. There are three variants, one for http connections, one for talking directly to the zope3 publisher and a last one for the zope2 publisher. All variants can be recorded with the testrecorder.

correction 2006-07-05: you can install the server-side testbrowser in almost every webserver, you just have to figure out how to hook it up. It is easy to install inside zope, of course. Note: you need to install it on the server which you are testing. From the testrecorder, you can generate python doctests (and selenium tests, but the doctests are handier for automatic testing).

Mercurial revision control tool
(Talk by video) Mercurial is written in python.

Sebastian Lopienski - A single word that's almost missing in the abstracts
He read all the abstracts and got some statistics out of it. A word you didn't see a lot is "security". Googling for "Python programming" versus "python security": security is 1.7% of the "programming" results.
For perl it is 2.2% and for PHP it is 13%. Either they do a lot for security or they have a big security problem. He built a small demo website giving the programmer advice based upon his chosen technologies. So: what do we think of the site and do we have input on python-related common problems and pitfalls?

Python is faster than assembly. Really.
Speed is measured as distance divided by time. Someone at CERN said that only 4% of the matter in the universe is known. Well, we can say that there are a few things that don't exist: the tooth fairy, Santa Claus and completely specified projects. So the only distance points that really matter are the business idea and the point where the process is implemented and revenue is created. The time python takes between those points is way less than with assembly or C, so python is faster than assembly.

Niels Mache - Spreed conferencing application
Spreed is implemented in python for some 70-80%. Webcasts, conferencing, powerpoint sharing, etc. Spreed is available as software, as a service and as a hardware appliance.

Martijn Faassen - Update on lxml
Lxml is a python xml parser. Lxml is high performance, it is pythonic and it has lots of features. Not many of the other python xml libraries have all three. lxml builds on libxml2, but has a much more pythonic interface than libxml2 itself has. Some changes: there's a new maintainer who did loads of work. There is a 1.0 release and a 1.1 is in alpha. Lots of improvements. If you work with xml in python and you need high performance, a pythonic api and lots of features: use lxml.

Luis Belmar - itools catalog
itools has a catalog engine.

Philipp von Weitershausen - properties versus decorators
With "property" he means those python new-style-classes that allow you to have just

Decorators allow you to do things like:

    @property
    def feel():
        #xxxxxxx

Doesn't really help when you also have a setter. He tried something with

    class JamesBrown:
        class feel(classproperty):
            def getFeel(...)
                #xxxx
            def setFeel(...)
                #xxxx

Rob Collins - python software foundation and massage
PSF protects the python intellectual property, funds some research and organises pycon. And.... Rob is going to raise money for the PSF tonight at the dinner by doing paid neck massages. (He's very good! I had one last year).

Mark Mc Mahon - PyWinAuto
Demo on what you can automate on windows with pywinauto.

Raymond Hettinger - interactive activation and competition
Treat a database as a neural net. You can investigate relationships, compensate for missing data, extrapolate values, etc. Impossible to blog, all those screens of data, but it was pretty funny to see that it actually works a bit. He'll have the python code up on the cookbook this evening.

One talk I didn't commit yet from this afternoon:

Gaëtan de Menten - Tiny ERP
Tiny ERP manages accounting, sales/purchases department, stock/production, customer relationship management (CRM), project management, etc. The advantage is that it is integrated and extensible. The target market is small to medium businesses, they're not yet targeting the big customers. The project was given wider publicity in 2005. There are 5 full-time developers at Tiny and they have some 23 partners in 11 countries.

One convenient feature is automatic partner segmentation: 20% of your customers deliver 80% of your sales, so tiny ERP allows you to filter them out.

The architecture is client/server, but it is server oriented: all the logic is on the server. The database layer is postgreSQL, the object layer is python and the view layer is in XML. The client has almost no logic, it communicates with the server using xml-rpc. Workflows are also defined in XML, but you can generate an image from it.

Framework shootout
20 minute coding sessions by three frameworks (django, turbogears, zope3) followed by a panel discussion.

There are four different frameworks being presented. Every one is given 20 minutes to demonstrate, code and speak to give the audience a feel for the possibilities.
I'm not going to type in everything everyone is typing in at top speed into their editors. I'll just limit myself to some things that made me wonder, gawk with awe, shudder with horror or whatever. It's just my impression, not a quality assessment of the framework in question.

Kevin Dangoor - Turbogears
Kevin has a macintosh 12 inch laptop and uses the "textmate" editor. He'll create a wiki for us.

Kevin dives right in and has turbogears generate a directory with a project skeleton for him.

What looked strange to my ZODB-trained eyes was that he was basically defining database tables and columns. He defined it using some fancy sql-to-object mapper (ActiveMapper), but that's still defining SQL tables to me. Man, I'm spoiled with zope's object database :-)

He starts off on a template and that's html sprinkled with '{&'-like constructs. He said at the end of his earlier presentation that he saw no value in xml-based templating systems, but apparently tastes differ. I kinda like zope's tal/metal templating system that uses html constructs and uses attributes to add functionality to them:

    <ul tal:define="results context/getWeblogEntries">
      <li tal:repeat="result results">
        <a href="" tal:attributes="href result/getURL" tal:content="result/Title" />
      </li>
    </ul>

Ah well :-)

In one of his edit forms he had a very handy autocompletion if you start typing a value previously known to the system. That looked pretty OK.

My impression: you've got to get to know it a bit, but it isn't that big. It looks fine. You can produce good websites with it. Elegant.

Simon Willison - Django
Simon has a 15 inch macintosh laptop (looking at the speed of the thing, probably a macbook) and also uses textmate. See also the summary of his earlier talk.

While rehearsing for this shootout he soon discovered that live coding wasn't something for him, so he prepared some parts for copy-pasting. That way he can give us a better idea what Django is up to.

Simon also tries his hand at a wiki.
When running in debug mode, django gives you a rich amount of debug information if there's an error in your application, which helps to to quickly spot errors. Django sure restarts/reloads/whatever FAST. Compared to restarting a plone site... Real Fast. Simon is a big fan of using django's capability of using base templates, which gives his pages a bit more layout. It's not the huge amount of user interface stuff that plone gives you out-of-the-box, but it solves the same issue of getting a good starting point. Handy: you can create custom template tags to make your templates shorter. It is a bit comparable to zope's template macro mechanism, but as it uses python code behind the scenes it has a very nice ring to it. It might be more powerful. Philipp von Weitershausen - Zope 3Philipp uses a 15 inch macintosh laptop. The editor is mighty Emacs. He's not going to create a wiki, but he's going to show off what zope3 is good at. "Zope2 was the Ruby on rails, django, turbogears of 1998." With the exception of using the whole object publishing idea that zope used. It is successful, look at Plone. Zope3 is a complete rewrite, especially aimed at serving python developers well and to play well with others. It is aimed at tackling complex problems. It uses a component architecture. There are a lot of existing components. Persistence, templating, i18n, sessions, cataloguing, forms, workflow, etc. To create an application you're going to glue together components (without having to change the existing code) and code a little bit yourself. URLs in zope are build by traversing objects. Many objects can be treated as filesystem folders that way. That's something very different from the regex url-to-method mappers from turbogears and django. Philipp creates a new zope product and defines a new folder traverser
by subclassing from [...]. To tell zope about his new class, Philipp adds a [...]. Restarting zope is slow as usual. The second part of the work is to create an adapter for ordinary simple documents that detects WikiWords. Couple of lines of code, register it as an adapter, finished.

So Philipp took a normal folder and changed its way of handling content IDs (case insensitive). He also adapted the built-in simple document to allow wikiwords. All without modifying the original stuff. Great for reuse.

Panel discussion

We had some time at the end so it was time for a panel discussion.

Q: "Turbogears: what about releases? There are two different betas or betas out apparently."
A: We need some more documentation.

Q: "Which approaches do you want to steal from each other?"
Django: Turbogears has a nice quickstart.
Zope3: zope has nailed the complex stuff right on the head, but getting started is hard; that's good in django and turbogears.
Django: we want to steal zope3's permission system.

Q: "Turbogears and django look much the same to me."
Turbogears: more of a philosophical difference, turbogears is more geared towards reusing existing components. (And no, he's not comfortable telling people to use turbogears instead of django. Matter of preference.)
Django: They occupy many of the same places, it is mainly a taste issue: where are you comfortable?

Q: "Did you try interoperating?"
Turbogears: we tried some django stuff that looked good, but it wasn't really documented. There hasn't really been any usage of each other's components. Some of the external components that turbogears uses are also used by django.

Q: "Spitting out different formats? Especially json/ajax instead of plain html."
Django: We're using a template engine, there's no reason why we can't export what ajax/javascript/json wants.
Turbogears: Likewise. Django doesn't do anything with ajax (except in the admin stuff), that's outside. Your templates can do it, though. It is open.
Zope3: Of course you can throw another adapter at it. There's already a json adapter for json requests that does most of the work.

Q: "Any chance of merging django and turbogears in a year's time?"
Django: I think not.
Turbogears: perhaps some interoperability using WSGI, with part of the website handled by turbogears, part by django.

Jim Fulton - zc.buildout
Interesting talk by Jim about Buildout, a tool that helps you control your eggs-based development. It sounds instancemanager-like :-) At zope corporation they had problems with getting their heads around how to use eggs in a development environment. Their existing buildout environment was starting to show its age and it might just be the thing to manage those eggs. Eggs are self-contained zero-installation python module
installers. The hard part is:
Setuptools is the heart of all this: it helps you build eggs and it handles dependencies. easy_install finds distributions on the net and installs the built eggs into specified locations. Some things he doesn't want:
Jim wanted greater control over the eggs that are used. Specific versions, so that they can test the exact software that they're going to ship. You don't always want the newest of everything. He figured out that the old zc.buildout could be revamped and be used to manage the eggs.

Buildout creates an assembly of parts, for instance to assemble databases, zeo servers, app servers, etc on multiple machines. The initial versions were make-based, which is a terrible scripting language. A few months ago he started on a 1.0 that would be made in python and that would manage eggs.

Buildout consists of recipes, which are small python classes or methods that do one thing (like installing something). Recipes are managed as eggs. So there's support for developing python software using developer eggs. Each buildout has a configuration database built with python's configparser. A simple example:

    [buildout]
    develop = mkdir
    parts = mkdir
    log-level = INFO

    [data_dir]
    recipe = mkdir
    path = mystuff

It is under active development and it is ready for production. There will be recipes that are missing that you might have to write yourself. A near term goal is better control.

Note by Reinout: I've made a program that partially does the same things, though on an entirely different basis and aimed a bit more at plone: instancemanager (https://plone.org/products/instancemanager). See also the weblog entries on that subject.

Europython: domain specific language and bayesian classifier
Two talks from tuesday morning from 2006's europython.

Anders Hammarquist - Python as a domain-specific language

They needed some new custom language for the BLMs (business logic modules) for their CAPS system. They had an older version, but that didn't have decent inheritance, the source files were too large, there was no introspection, etc. Why not do it in python? Well, there are no typing constructs and you can't really add new keywords to the language (which they needed). But python had everything else that they needed. So they modified python a bit to fix it: metaclasses (pypy's [...]). They use two pieces of code from pypy:
What did they get? Python with some strange conventions. They gained all the python features.

Tarek Ziadé: CPSBayes, naïve bayesian classifier for CPS

Bayesian classifiers are simple probabilistic classifiers. They are used for document classification, spam detection, text mining, data mining, etc. A bayesian classifier is given texts sorted in several categories (like "ham" and "spam") and will grab the words out of them. It then calculates which words are probable indicators of ham or spam. (That bloody Frenchman had an example with "winners" and "losers", with the winner being France and the losers Italy, Portugal and Germany) :-)

The bayesian classifier needs to get the relevant words from the text (so: exclude "the", "it", etc.) and process them. Reuse: it uses textindexNG3's splitting (on spaces, tabs, points, commas), normalising (lowercasing, and for French, it removes all accents, for instance) and stemming (make everything singular instead of plural). Their BayesCore is a pure zope3 product, CPSBayes is a tiny zope2 CMF layer around it. Tarek experimented a bit with it and you could use it for automatically filling in metadata and for automatic linking between documents.

Europython keynote: Alan kay - Children first
Europython 2006 keynote on children's education, the 100 dollar laptop, squeak and python. Alan didn't present in person, as he was ill and they didn't want him to contaminate an entire airplane :-) He did the presentation over a video link instead.

What if we put children first? He thought about it around 1968 and thought up something booklike, as children move around. Now he's on the 100-dollar-laptop board and there'll be such a thing. What's the cost of your current laptop? 50% is profit, marketing, distribution, sales. 25% is microsoft software. Of the 25% that's left, half is the display, a quarter the harddisk. So they now have hold of a 40 dollar innovative screen. And they're using flash memory to get the harddisk costs down. This project is going to change some of the statistics: it is going to raise the market share and importance of both linux and python.

"We see things not as they are, but as we are" (the talmud). We humans are sometimes totally incapable of seeing certain things. We're fooled by our brains. Just look up a few of those perspective image jokes on the internet. A part of education is getting us to see things that we don't see by nature. Human universals (big list) boil down to a "story culture". They exist in all or almost all cultures. Non-universals include democracy, writing and learning, equal rights, perspective drawing, agriculture. These are especially hard to learn.

He had a lot of nice children-oriented demos made with the smalltalk implementation squeak, you can find the examples at squeakland.org. He hopes that a system like this will be implemented in python. There are many more python developers than squeak developers. And python is going to be pretty big in the 100 dollar laptop. Some 10 people have implemented virtually everything of this system. Alan asks every one of us to think very hard about putting children first and helping out with this.
Update: Guido van Rossum also has a nice summary, be sure to also read the comments.

Europython: internationalisation talks
Two talks on internationalisation and localisation at the 2006 europython conference.

Marc-André Lemburg - Unicode-aware applications in python

The unicode consortium's solution to encoding issues: one encoding for all text in the world. Ascii compatible, even latin-1 compatible. The often used utf-8 encoding encodes unicode into 8-bit "code units". Python supports it all (that is: unicode 3.2 support in python 2.3 and 2.4, unicode 4.1 is supported by python 2.5). Python's native unicode type is very efficient and its performance is equal to or better than normal string processing. There are a lot of codecs to encode unicode into latin-1 and so on.

Using unicode in python has problems. Not all modules expect unicode, so you've got to encode your data then. Some operating systems also cause problems, especially for filenames. General principle: use unicode for all your text inside your application. Avoid mixing unicode and strings. So use explicit encoding/decoding in all I/O operations.

Internationalisation (i18n) approach:
And: enclose all literals in a call to a translation function:
The most often used tool is GNU's gettext, available through the python gettext module. There are lots of tools for it. Egenix (which Marc-André is the boss of) does it a bit differently, with an on-the-fly approach with translations stored in a database.

Philippe Bossut - Internationalisation in python with pyICU

Internationalisation is hard, there are multiple things to keep in mind. When displaying a few dropdowns for type, date and time in a sentence form for an appointment application, you have to remember that the sentence order changes per language. Sort order varies per language. a-z is clear, but where does ä go?

Chandler, the case study in his presentation, is a PIM (personal information management) application that wanted lots of good i18n and l10n. Chandler 0.5 only supported ascii, had hardcoded date format strings, everything was English, etc. A typical US-originating open source application :-)

ICU is a mature set of c/c++ and java libraries for unicode support and internationalisation. Unicode text handling, unicode regular expressions, date formatting, locale-dependent sorting, etc. Looked OK. So in May 2005 they added python bindings to the c++ ICU libraries by using SWIG (a wrapper generator). They have a hand-coded leaner wrapper now, which is quicker. They only wrapped the parts of ICU that they needed themselves for chandler, btw.

There's one ICU part that they don't use and that's ICU's translation mechanism, they used the gettext mechanism instead. It is used a lot in the open source world. A handy method to deal with message strings that include [...].

About the infamous "UnicodeDecodeError": always do the conversion at the I/O boundaries. He said it, Marc-André said it: do it. Chandler has a lot of I/O boundaries: http, webdav, filesystem, etc. Nice idea: they're planning to provide their translations via python eggs.

Europython: eXtreme management, design by contract, bebop groupware
Afternoon talks at europython 2006.

Maurits van Rees - eXtreme management of projects in plone

Maurits made an eXtreme management application to manage XP projects. The main element of his presentation is XP's planning game. The planning game is about having short (1 to 3 week) iterations, stories, tasks, etc. instead of One Big Project.

eXtreme management has separate roles: manager (can do everything), employee (can't mess up the project planning after the start of an iteration) and customer (can't change things after work has started). At Zest software, we normally give managers and developers that role site-wide; customers are added locally to the projects, so that they can only view their own projects.

Stories can be added by both managers and customers. The initial workflow state is "draft". After the story is well-described, the customer can submit it for estimation. Rough estimation (by the manager and the developers) is done on the story level and is measured in days. You get a more detailed estimate by adding tasks to the story; tasks are measured in hours. Developers add those tasks to the story and estimate them. The total of all tasks is the real estimate for the story.

There are a couple of checks in the system: you cannot activate a story without first adding tasks and estimating them. When you do activate something, everything inside it also gets activated and ends up on the to-do list of the developers. Developers can book their hours directly on the tasks, which also shows up to the customer: a previously unknown level of transparency for the customer.

Aaron Bingham - Design by contract in python: present and future

Why design by contract? Precision and accuracy in documentation, so quality. The documentation cannot get out of sync with the code. A common question is "why do we need it, we already have unit tests". Design by contract adds to unit tests, it does different things. It is stronger at catching integration issues.
And it allows you to code less defensively, as any uncertainties are caught by the contract. Less defensive code means less code, which is good. The three essential parts of design by contract:
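In the usual design-by-contract terminology the three parts are preconditions, postconditions and class invariants. A rough sketch of how the first two could be checked with a decorator is given below. This is not one of the implementations that were demoed: the `contract` decorator, `ContractError` and `integer_sqrt` are hypothetical names used for illustration only, and invariants are left out to keep it short.

```python
import functools


class ContractError(Exception):
    """Raised when a precondition or postcondition does not hold."""


def contract(pre=None, post=None):
    """Hypothetical decorator: 'pre' is called with the arguments,
    'post' is called with the return value."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Check the caller's side of the contract before the call.
            if pre is not None and not pre(*args, **kwargs):
                raise ContractError("precondition of %s violated" % func.__name__)
            result = func(*args, **kwargs)
            # Check the function's side of the contract after the call.
            if post is not None and not post(result):
                raise ContractError("postcondition of %s violated" % func.__name__)
            return result
        return wrapper
    return decorate


@contract(pre=lambda x: x >= 0, post=lambda r: r >= 0)
def integer_sqrt(x):
    # No defensive check for negative input: the contract covers that.
    return int(x ** 0.5)


print(integer_sqrt(16))
```

Calling `integer_sqrt(-1)` raises `ContractError` before the function body runs, which is the "code less defensively" point: the check lives in the contract and not in the body.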
All three are checked before and after every call to a class method. In the original (eiffel) definition, it was qualified calls, but we don't have something like that in python. Python's [...].

There are some people who tried their hands at an implementation for python. He demoed his own prototype (aspects), pep 316 has something, Plösch, ipdbc, pydbc. In the end he compiled a table comparing all the approaches. Conclusions on the state of implementations: a solution using decorators would be interesting for comparison. Only the PEP and Aspects are workable solutions right now. Aaron's conclusion, in the end, is that we're not far off.

Uwe Oestermeier - Bebop, a zope3-based groupware

They packaged their zope3 server as an out-of-the-box downloadable application, which looked like a userfriendly way of distributing it. One of the things that Bebop manages is documents. It keeps track of versions and the various users are notified that they have to download new versions if someone uploads a new one. There's also a blog, a wiki and a message server (I missed the first 5 minutes of the talk, so I might have missed some additional ones).

Bebop has a central catalog with indexes for authors, documents, versions, dates: who did what when? Their current setup is a central ZEO server, the wxPython client connects via ZEO, the metadata is in the zodb, the actual files are in the local filesystem. They are thinking of using twisted instead of ZEO.

Europython: wsgi, django, moinmoin
Talks from the web frameworks track on wsgi, django and moinmoin.

James Gardner - Developing applications with WSGI

Murphy had a field day with the projector. It worked perfectly till the presentation started. A big "clean the air filter" notice appeared and couldn't be removed. Ok, let's switch the channel. Ouch, now we're just getting some faint gray lines. Murphy's law. Kit Blake got the projector tamed again in the end :-)

His talk is much more low-level than Kevin's, he'll basically go through the WSGI PEP. The problem: choosing a web framework (zope, quixote, webware, twisted, etc) is tricky, as you have to like the whole of the web framework: they used to come as big monolithic blocks. The WSGI interface allows the web frameworks to chop themselves up a bit and make individual parts available.

WSGI itself is pretty simple. Your application should accept an environment (including form variables, etc) and a [...]. The interface is easy to implement and it makes your application WSGI-usable with lots of other applications. You can write a separate library that does something, register it as WSGI and use it from any other WSGI-expecting app. There's a new website at wsgi.org that has an overview of available middleware, utilities, documentation, etc.

Simon Willison - The django web framework

He worked for a newspaper: web development on a newspaper schedule. That newspaper Went Wild with the kids' league. They wanted a professional site for the kids' league: team descriptions, team stats, schedules, sign-up forms for parents to get email alerts, 360 degree photos of all stadiums, etc. In three days. In the end, this turned into django.

They had a great css/template designer and a couple of very capable and hard-working interns. And they were going to build it in python. And they were going to have clean URLs and a good template system (so that they could get the designer to work right away: remember, a 3-day deadline).

Views are what django developers spend most of their time in. Methods that accept a request and send out a response. That's where the work really happens. Views are developer-oriented and thus can use lots of little utility methods.

URLs are handled with regular expressions. If an expression matches, the request gets handed to the corresponding method. The regular expression can grab certain elements out of the URL, and these get passed to the method, too.

Django uses databases to store data. It can create the databases for you, which also gives you some handy methods to get your data back out of the database. One fancy thing is that most of the queries are lazy: they don't get executed till the last moment. So you can grab all data, take the first three of those and print the results: this will only query the database for the first three items.
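The lazy behaviour can be imitated with a plain python generator. This is only an analogy using the standard library, not django's query API: `all_entries` is a hypothetical stand-in for "grab all data", and the `fetched` list merely records which rows were actually produced.

```python
from itertools import islice

fetched = []  # records which rows were actually "read from the database"


def all_entries():
    """Hypothetical stand-in for a lazy query: rows are produced on demand."""
    for row_id in range(1, 1001):
        fetched.append(row_id)
        yield {"id": row_id, "title": "Entry %d" % row_id}


# "Grab all data, take the first three": nothing runs until list()
# starts pulling rows, and islice stops pulling after three of them.
first_three = list(islice(all_entries(), 3))

print([row["id"] for row in first_three])
print(len(fetched))
```

Only three of the thousand rows are ever produced. A database-backed implementation would presumably push the limit into the query itself, but the effect for the caller is the same: asking for "everything" costs nothing until the results are used.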
Django templates are normal html files, sprinkled with curly braces to tell django to do things like iterating. You can have a base template that defines some extension locations that you can fill later on, for instance for putting a site header and a standard footer in there. Django's template system was designed to handle the environment they had at the newspaper.

What they did for the newspaper turned into a lot of small mini content management systems all bundled together. So they started a central management view to coordinate it all. Some success stories:
Thomas Waldmann, Alexander Schremmer - MoinMoin wiki development

MoinMoin is a pretty popular wiki implementation, for instance used on parts of python.org. Ubuntu uses MoinMoin for almost all their websites, which shows that you can modify MoinMoin's looks a lot, as it doesn't look like a wiki.

Thomas showed the MoinMoin plugin types. Parsers, macros, etc. MoinMoin now has a nice plugin system, allowing you to modify individual parts of MoinMoin to your liking. Other parts are not plugins, but are meant to be extended by using interfaces: user authentication, converters (from html to wiki markup), security policies (antispam, autoadmin, etc.).

For the actions, they have plugins. Actions are "show", "diff", "delete page", etc. Also plugins: parsers and formatters. Formatters are called for individual pieces of text, so italic text goes through the italic formatter. Adding formatter actions thereby extends the capabilities of the format. Theme plugins customise the way that MoinMoin looks. Such a plugin consists of a bit of python code and stylesheets and icons.

MoinMoin supports Wiki xmlrpc: some standard RPC for getting/writing pages, info, page list, etc.

Europython: Kevin Dangoor - Working together on the web
Nice talk about re-using existing modules when developing a web application. Kevin built an RSS newsreader while recovering from perl and java: in python. There's a lot of stuff available in python, it comes with batteries included. Everything that was in ruby on rails (which he also looked at) was already available in python, it just wasn't packaged together like rails. Two additional big plusses for python:
Kevin introduces a hypothetical programmer "J" who wants to build a virtual circus. Hey, you need an example :-) Kevin convinces J not to write his own framework. Python has the WSGI module, so J can concentrate on the circus instead of writing a webserver: he only has to adapt to WSGI (pronounced "whiskey" according to Kevin).

Next up is some state-persisting, so talking to databases: middleware. WSGI has interfaces for that, so he can go to the cheese shop, search for WSGI, pick out a module to his liking and start implementing his end of the WSGI interface. J can use python Paste to plug various WSGI parts into each other. Paste is sort of an application generator. Look at that if you need something like that.

Is it worth learning all of those things yourself or are you better off taking an existing library? For Turbogears he did build the widget/form system himself as he couldn't find anything that fit in nicely with the rest of his system (he wanted to use the KID template language).

Packaging can also be a big task. Lots of different OSs, especially with python. If J packages everything into one big tarball, he'll have some happy users, but others will complain about module version mismatches with their already installed stuff. RPMs make some linux users happy, but leave windows out, etc. Solution: setuptools, easy_install and eggs. (Reinout: yes, works nicely). It is a sort of debian version/library management system for python modules that allows you to easily distribute your modules. You can have dependencies on other modules, including requirements for minimum or maximum versions of those modules.

Also handy: eggs can have "entry points" for plugins. One egg defines an entry point, other eggs then can provide plugins for those entry points. Plug and play. For instance, the KID template language registers itself into the [...].

Back to the circus. J did choose Beaker as his way of handling persistence (in this case for sessions). Beaker registers itself with easy_install as a plugin for WSGI's persistence interfaces. Beaker requires the myghtyutils module for handling sessions itself, but that's just one [...].

Kevin: "no talk about web applications in python is complete without talking about zope3 :-)". Zope3 is about reusing existing code, so [...].

Conclusion. Kevin doesn't want to discourage people from building their own better templating systems, but then don't write your own parser, there are many of those. If you think the existing web frameworks are crazy, by all means build a new one, but then don't write your own webserver, session handling, etc.

ArchGenXML introductory talk at europython
I get to give an archgenxml introduction at europython 2006. I just got the europython 2006 program and my talk is in: "Generating content types and workflow with ArchGenXML", hurray! Monday morning, right before lunch, which is a good spot as far as I'm concerned.

For many people, ArchGenXML is the most attractive way to get started with Plone development. Generating your content types from a UML class diagram is easy and fast, especially as ArchGenXML sets up all the "bookkeeping code" for you: the [...].

ArchGenXML is great at generating a complete workflow out of UML state diagrams. You get good code that hardly ever needs modification from a UML diagram that you can readily show to your customer - and he'll be able to understand it (mostly). I'll show how to do this in the presentation. Depending on the improvements made to ArchGenXML I hope to show our support for zope3 goodies in plone 2.5.

Oh, and allow me a brief recommendation of "eXtremeManagement of projects with Plone" that my brother Maurits gives on monday right after lunch :-)

Europython 2005 zope lightning talks
Wednesday afternoon zope lightning talks from the europython conference. See also the complete write-up overview. Included is also a talk about the zope foundation.

Rob Page - Zope foundation

All the info is on the web, so just the questions afterwards.

Q: was there any thought on making it a European foundation instead of a US-based one as there are more zope developers in Europe?
A: No. But we want to try and make it tax-deductible, for instance, for European companies.

Q: what will the money of the foundation be spent on?
A: funding sprints, paying for zope.org, perhaps funding some developers. And you need someone to answer the phone.

Q: will the Z3ECM project be in the foundation's code?
A: don't know. It's just by accident that these two things are underway at the same time.

Q: issuing rights to use the name zope?
A: For most cases (local zope users groups) it will be the foundation that does it.

(lightning talks below)

Lightning talks

Astrid - great presentations

As an outsider, she's allowed to ask whether Pythonians have a heart. When talking to people: yes, she's sure. But when listening to talks: she's not sure. Why do you hide your heart when giving a presentation? Don't give a presentation when it doesn't come from your heart! And if it does come from your heart, show it. That gives you 85% of the attention of the participants.
Eyes, mouth, arms, legs, heart: that's your recipe for great presentations.

Andreas Jung - textindexNG version 3

Great new feature: it works well with multi-lingual content! Also there are now complex queries over multiple fields.

Christian Theune - blob

Blobs for zope seem finally on their way.

Andreas Jung - zope + xsl-fo

Current printing is mostly based on pure browser printing. Layout may be wrong and there's no hyphenation. XSL-FO is an XML dialect for formatting layout (mostly towards pdf). There is a tool called CSS2XSLFO that takes a html page including css and makes XSL-FO out of it. He did a first implementation that converted 10000 documents which looked very nice indeed.

Joost Roman - Blogs for silva

Quite simple. Just adding silva docs to folders and so on.

Tres Seaver - Zelenium

Most was covered in Maik Roeder's presentation. One thing that was not covered was how to test against a test instance without fouling up all subsequent tests if there's an error in the current one. There's a DemoStorage possibility in zope that you can reset quickly to get a good begin situation back. Bit underdocumented, but basically you have to put '' tags around ''. Just copy your filestorage and run them on a different port (or something like that, it was quick). Works in zope 2.7.6.

Matt Hamilton - pdb

Quick intro to pdb, same as last year. Some emacs tricks: Slap that "Don't use [...]

Michel Pelletier - Zemantic

Zemantic is really just ZCML and page templates, no logic. Just a wrapper around rdflib. The power of zope3 in action! Zemantic actually used rdflib 2.0, but it has heavily influenced version 2.1 :-)

RDF is just a standard way of encoding (not implementing) Codd's relational database model. In many cases you can use Zemantic mostly as you would catalogs. But for catalogs you need to create indexes and so you need to know beforehand what you want to put in there. You can only put in there what is allowed. Zemantic is much more civilised.
Zemantic content describes itself and you can shove almost anything in. For normal usage: use the catalog. For data-centric apps: use zemantic. The "Sparql" query language will be included in the next version!

Kai Hänninen - PrimaGIS

Web mapping application in plone. They're trying to combine traditional spatial data with content from the CMS. So you have to allow existing and future content to have spatial parameters. ZCO (cartographic objects for Zope), PrimaGIS for plone and the python cartographic library (mapserver in python). All of them are quite new, but already highly usable and actively developed.

Christian Zagrodnick - CMFLinkchecker and link monitoring

It's actually for plone and not for plain CMF... When viewing a page as an author, you see the links in red, orange or green. Green is good, red is dead and orange is temporarily down. Handy. There's now also a link monitoring thingy that shows you what links to a certain object exist. The link checker runs as a separate (twisted) server.

Aisté Kesmitanaité - Ivija360, zope3 application

It's about 360 degree feedback for/from employees for HR departments. Getting feedback, giving feedback. They're using reportlab to send out nice shiny PDF forms with the results and the nice graphs.

Roman Joost - graphical editor for alphaflow workflow

Graphviz for laying out the graph and SVG (mozilla) for the visualisation. Couple of issues with SVG implementations and consistency of the UI, though.

Lucky me

Yesterday I won a python Tshirt, today I won a 12 year old malt whisky for filling in the feedback form :-)

Andreas Johnsen - Killing sharepoint zopely

With storing word 2003 docs as xml, you can already build nice imports for zope. But they hope to be able to use the sharepoint functionality in word. It looked OK, but took the latest .net and the very latest visual studio. Then you were able to show a part of the zope interface in the sharepoint sidebar in word. (Or so, I'm not familiar with sharepoint).
Christian Theune - msaccess plone

He made pymdb that reads in mdb tables. Plus a plone product. You can upload an access database and select tables and so on. He demoed it with the Tmobile database of hotspot locations. Very neat, especially when searching through the standard plone search box. It also matches inside the database and, when clicking on the database, it only shows the matching field. Real neat.

Duncan Booth - plone multisite

(Look at "composite page", he said it was good for mixing and matching parts of a page). Multisite demo.

Vincenzo Di Somma - PloneWorkflows

A collection of workflows that should cover a lot of the commonly occurring scenarios. Great.

Alexander Limi - Presentation Hubris

How to present with a minimum of fuss. In plone there is a plone slide mode. Opera supports it. h1, h2 means new slide. Bullets, defs are ok.

PTProfiler

ptprofiler is one of the most useful tools there is. You can profile your page templates! This allowed plone to kill lots of the things that took time. Warning: don't install it on a production server, as it re-enables itself when you restart the server.

Martijn Faassen - developer marketing

How (not) to build an open source community. When you find a piece of software on a company website, would you give feedback? Would you submit a patch? Would you join the project? Ask yourself the questions again when you find it on a developer site (like codespeak.net). Perhaps there's less chance of trying the software as you're not sure of the reputation, but you'll be much more likely to join the project. This one difference can make all the difference.

Godefroid Chapelle - Ajax infrastructure in zope

Xforms and so on is nice, but not really supported now. Ajax is the quick solution, BUT you don't want to support JavaScript. So something javascript-generating or so would be nice. Azax uses a combination of a javascript preprocessor and commands written in XML.

Europython wednesday morning
Wednesday morning talks from the europython conference. See also the complete write-up overview. Included here: research on fun and software development; sprints; choosing good names; selenium functional testing; using tests for motivation.

Benno Luthiger - Fun matters

Results of the study "fun and software development" (FASD). Sometimes software seems a by-product of some process that provides fun :-) One of the hypotheses was that "developing open source is more fun".

What is fun? For his research he needed a more unequivocal term. He uses the term "flow" (from Csikszentmihalyi). Flow is characterised by knowing what to do (like in sport), having the right challenge, a good match between capabilities and requirements, concentration, a perception of increased control, etc.

He did his research by making a questionnaire for open source developers and developers of purely commercial software. (Hm. I'm wondering now whether I've filled in this questionnaire: I remember something like that...) Funny: apparently there is a linear relationship between "fun" and "engagement for open source". The more fun, the more readiness to work more on open source. It doesn't wear off quadratically with more and more fun.

After a complaint about calling it "commercial" instead of "proprietary", he explained that he asked open source programmers in general (whether they got paid to do OS development or not) and developers in a couple of Swiss companies. Some differences: in open source you've got an optimal challenge (you do what you can) and you've got more project vision. Commercial: more formal authority, monetary incentives and deadlines. (I didn't get everything, I didn't understand the lists of numbers that weren't too well explained. Ah well, there's probably a paper that does it.) The results are here.

Beatrice Düring - PyPy and sprint-driven development

She showed a picture of a school class sitting neatly in rows, looking uncreative.
That's the optimum of 1000s of years of educational development?!? Definitely not how our open source projects are looking. The PyPy project makes a nice python-in-python compiler. They're now funded by the EU for two years. They also have to do research on agile methods. One of the things is a "sprint". It originated at zope corporation for the zope3 development. A multiday session of intense development, 2-5 days long, no more than 10 people and using aspects of extreme programming.

Some good points of sprinting. The productivity comes mostly from teambuilding. Imagine a ball falling and falling and falling... You need to bounce back up, otherwise the developer leaves. During the "falling" stage, you orient yourself ("who are you"), start to build trust, align the goals and roles. And then the bounce: commitment. Then moving up: implementing, WOW, restart (do we continue, do we do something else, etc.). This means we need to structure the sprint. You have to have a process, steer it a bit, and have some social skills to get a successful sprint.

How is it done? There is actually quite a lot of info on the net (mostly zope-centric) on how to do it regarding content+logistics. You need the infrastructure: connectivity, coffee, a room. Otherwise the sprint won't work. On procedure: have an introduction, followed by a tutorial to get new contributors up to speed. Important: tracking! Track what everyone is doing, keep showing interest. Track against the goal. Michael Hudson once asked whether it was actually possible to do open source sprinting 10 years ago: no ryanair, no wireless, no affordable laptops? He had a point.

Within PyPy they try to learn by doing, by reflecting on what they've done. Trying to design a process, to adjust the process. And disseminate the process. So document the process. Funny: they're also working at integrating project management (=meetings) with the software development during the sprints. Issue: functioning within EU funding. 
There are challenges regarding sprinting. For instance expectations and participants: the expectations must not be too heterogeneous. Vision versus implementation: during the sprint you're creative, but most ideas won't get implemented during the sprint, so they have to be done in between the sprints. This must be taken care of. Issue: leadership and process management, which depends a lot on the participants and on whether they're proactive or not. Another issue: funding. Ryanair and hotels cost money. Especially for unfunded open source projects. If you've got experiences or comments on sprinting, please help Bea with your info (bea@changemaker.nu).

Stefan Holek - Choosing good names

"And God called the light "light" and it was good". Naming things is not just something God is allowed to do, programmers do it too. After showing sheets with some assembly or internal computer bytecode, the need for "names" as such was pretty clear :-) Humans deal with complexity by means of abstraction. So the computer still uses numbers, but we've put a layer on top of it. First procedural languages, like COBOL. Comment about COBOL: devised so that management could read the code that the programmers wrote... Even further: object oriented programming. Object orientation gives you arbitrary levels of abstraction.

With the tools that we have available and the possibilities that we have, when programming, our audience is human. The computer couldn't care less what we do. And the machines are fast enough not to have to use machine-centric languages most of the time. This is the core of his talk. Observation: there is no substitute for experience and taste. You notice it when code is written by beginners! For communication you need a common vocabulary and so on. For good names we can look at lots of existing software. Input comes from computer science, the problem domain and our experience. The "gang of four" software pattern book has an important contribution. 
Not so much their explanation of how to implement the patterns, but naming them. Now we can talk about an Iterator or an Adapter and get a pretty good idea of what is meant. When coding, we are creating instances of our tools (lists, classes)
and we use names from the problem domain. So a list could be called
Hints. Code must be understandable by itself, comments don't help. Use English language names. So know your English. Be consistent throughout the program, the specs and the docs. Do not rename things halfway, that is counter-productive. Short names are good, but don't overdo it. Saving keystrokes is not
a strategy, though. No In your code, tell a story. A set of plain numbers in an argument
list is way less "telling" than keyword arguments. Module names should inform about the contents. Lowercase them. Martijn Faassen says that module names should always be singular, that way you don't have to guess. Class names start with an uppercase character. Often they are UpperCasedLikeThis. Method names signify what the method does or what it returns. They start with a lowercase character. Either mixedCase() or under_scores(). The first part should be an action, like trim_spaces(), not space_trimmer(). Variable names inform about the value. Booleans should start with
Like testing, we choose good names primarily for our own benefit.

Maik Roeder - Web application testing with Selenium

(I ran in five minutes late, so I missed part of the demo. Wrong room :-) There was a lot of good demonstration, so I won't write much here. By the way: the room was full, a good indication of the testing-readiness of the developers present, which in turn is a good indication of the probable quality of the code!)

There are two modes. In-browser mode has the tests run by html+javascript in one frame and the website under test in another frame. The tests are written in html tables. The other mode is the "driven mode", with the browser being steered by an application on the same machine. That program has a python interface. Selenium has actions for opening a page, clicking somewhere, typing
in text somewhere, and so on. Checks can be for text that's present or
not present, for instance. How do you locate elements (for
clicking/text entering)? By identifiers ( Selenium uses a javascript "bot" inside the browser. There is no need to change the core selenium if you want to do customisation, as it checks for a selenium-custom.js or so on startup. Tres Seaver made Zelenium, a zope product to make this easier for the zope world. PloneSelenium is also an alternative, especially for plone. PloneSelenium has a portlet which you can use to create tests. The tests consist of one python script per test. For plone 2.2 there is a plan (plip 100) to add selenium testing to plone. Great.

Johan Andersson - Using tests for team motivation

Testing by hand is boring. Automated tests might be boring. But you want happy customers. Those customers, though, aren't cheering you on 10 times a day. Automated tests, however, can cheer you on 10 times a day! See a running test as a customer cheering you on because they're getting good software.

Extreme programming has a big emphasis on unit tests and not so much on acceptance tests. They found out that unit tests made them sad and acceptance tests made them happy. Acceptance tests focus on customer approval instead of on programmer intent (a unit test tests that you did what you intended). Acceptance tests typically take a long time. Ouch. The long cycles resulted in more "coding by guessing" instead of "coding by testing". Hm. That needed to be faster. They tried different things to get the time down, but in the end they just piled on a lot of machines. Hardware costs less than people. They already had the infrastructure to distribute calculations over different machines. Tests slow? Throw in more machines. Some tests (at least in their business) took a lot of time and effectively were run outside the short edit-compile-feedback cycle. They had to compensate for that.

Really important: write a test first to demonstrate the wrong or missing functionality, start satisfying the test afterwards. 
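A minimal sketch of that test-first order in Python. The function `normalise_username` and its behaviour are made up for illustration; nothing here comes from Johan's code base. The point is only the order of work: the test existed (and failed) before the function body was filled in.

```python
import unittest


# Step 2: written only after the test below had been seen to fail.
# (Hypothetical helper, invented for this illustration.)
def normalise_username(raw):
    """Return the username without surrounding whitespace, in lowercase."""
    return raw.strip().lower()


# Step 1: the test comes first and pins down the missing behaviour.
class NormaliseUsernameTest(unittest.TestCase):
    def test_strips_and_lowercases(self):
        self.assertEqual(normalise_username("  Alice "), "alice")


def run_suite():
    """Run the test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(NormaliseUsernameTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_suite()` after every change gives the "cheering" feedback described above.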
This really strengthens all the other extreme programming points like collective ownership, simple design, continuous integration and refactoring. Regarding retrofitting tests onto an existing application: with unit tests, code coverage is an issue and achieving it can be a massive undertaking. Introducing and modifying tracing output to become more deterministic is much less work and can be gradual. Testing for customer approval is thus probably much easier to pull off.

The key issue for agile teams is social skills and the ability to adapt to a highly communicative environment. Some tips for the road to agility. Number one is to establish automated regression testing. Learn to estimate in 1 to 3 day pieces. Once the pieces are small enough, you can have integration meetings (every 2 or 3 weeks) and daily standups. That's also nice for the manager, who has a much easier task of tracking everything! Continuous integration: having something running all the time. Let your tests drive your development (they still haven't implemented that 100%). On pair programming: see it as continuous code review! Consolidate test coverage to enable courageous refactoring and collective ownership.

Important (according to him): acquire an on-site customer to use user stories. User stories are normally short, just a few lines. When you actually start to program, you almost always have questions you want to ask to clarify them.

Comment from the audience: pair programming is like rally driving. One person is steering and pushing the gas pedal, the other one is reading the map and showing the way. In pair programming, one is coding, the other is thinking strategically, keeping the goal in mind and trying to maintain the overview. (Hm. As a human you have only about 7 things you can keep in your head at any one time, so pair programming might give you some extra ones, say 10 in total. 
And one of the differences between ordinary people and really smart and productive people is the number of things they can keep in their head. 7 is the average. 8 is a lot. So if you're pairing and get to 10 or so... Dunno, just philosophizing.)

Johan: I spend about 30-40% of my time pair programming. The rest is administrative, debugging and so on. But. Every piece of code that gets into our system gets in there by pair programming.

Europython tuesday afternoon
Tuesday morning talks from the europython conference. See also the complete write-up overview. Included here: plone4artists, zope CMS projects discussion, my talk, accessing huge distributed data sets and component-based programming.

Nate Aune - Plone4artists

Plone4artists is a pre-customised plone site that is an out-of-the-box solution specifically for artists. Especially artist community sites. A lot of use is made of the zope support for WebDAV folders. This makes integration with for instance a pc (mac/linux/windows) handy. Likewise, integration with calendar applications via iCal was done. A funny thing is "plodcasting", a podcasting product for plone. This way a lot more use can be made of the lots of content that gets uploaded to the site. As it is for artist community sites, they integrated the "creative commons license assignment" product to assign the correct CC licenses to the various bits of content.

Nate recently started looking at archgenxml UML-driven development (yeah!). He showed some examples, like easily generating a new member type with some more attributes than the normal name/email pair. They're planning on re-purposing an iPhoto-to-flickr open source project to be able to easily publish files from your mac iPhoto application into your plone4artists site.

Question from Paul Everitt: "I don't believe that you've got WebDAV to work". Nate acknowledged that it was a lot of work and that it more or less halfway works sometimes reliably. He's been looking at PloneMall, which is a plone product for electronic shopping. This could help both the artists and the site to earn money by selling merchandise.

Zope3 ECM panel

Participants: Tres Seaver (CMF), Philipp von Weitershausen (zope3), Steve Alexander (zope3), Martijn Faassen (silva), Joel Burton (plone), Florent Guillaume (CPS). The idea is to cooperate much more than is currently customary among the various CMS-like projects in the zope3-time. 
Florent: zope3 is suitable for many things, what we want to code cooperatively is only content management stuff, we want to focus.

Martijn: the ECM discussion has gotten into too much of a political discussion, he wants to have more technical discussion.

Joel: Plone is focussed on an out-of-the-box good CMS experience. He hopes that ECM allows them to focus much more on that instead of on the more framework-like things like archetypes. Comment by Limi: it would be great to give the low-level technical people inside the plone community a better place to put their infrastructural work.

Steve: Focus on the python code, don't focus too much on the zope database. That might not be the best choice in every situation.

Philipp: the development model of zope3 might be something to imitate. Zope3 is already very good and elegant. The focus of ECM should be to lift zope3 up to be also usable for CMS. Also: don't be too afraid of throwing away existing, working code. They did it also within zope3 and that worked out well in that case.

Tres: one of the issues of projects like this is a need to trust each other's code, so a lot of testing is needed. Also: the existing projects solve real problems, so we've already got the use cases. And that is often the hard part.

There was a bit of discussion on backward compatibility. There was an attitude in zope2 to be really really backward compatible. Do we need to try that hard (which has drawbacks) or can we aim more at good migration support? Limi: look at the PEP or PLIP process, with strict deprecation rules. That should be good enough. Florent: at the moment the number of core zope3 developers is a bit low, so the system lay low for a while. Martijn: for ECM it's only needed the moment we actually have something which we want to preserve.

Martijn: we really need a solid, fixed release schedule for the core parts. The current situation is not good. 
Steve: frequent releases are important for packaging with OSs like Ubuntu, as they only Martijn: re-use can also happen on the python level. Python modules will work in 8 years' time, but a zope module? Probably not. Reach out to all those python developers. So, also for ECM, try to look much more to python.

Steve: for something like ECM you need to have "conceptual integrity". And for that you almost need a "zope pope". A single vision. Watch out for grabbing several fragments and putting them in one ECM project. Martijn: but... you almost need the python reach-out.

Martin Aspeli: you have "conceptual integrity" problems at various levels. Different naming conventions on the low level for instance. Also the way in which you use several components on the higher level... It is important that the examples that are the first things people look at are pretty much consistent.

Question: is ECM aimed at the enduser or is it a stepping stone for the plones and the CPSs? (I expected "plones and CPSs", but there was discussion on versions instead). Also a lot of discussion on software stacks. In fact, Plone and CPS use mostly the same stack in the sense of a recent python, the latest 2.7 zope, cmf 1.4 and so on.

Limi: a document on how to package zope would be useful. Debian, gentoo, etcetera: you don't want to know how they package it.... (The rest of the discussion was a bit hard to write down).

Martin Aspeli: do we need to build some community around ECM? There is the risk that it looks like a project for Really Good Programmers, thereby scaring away potential developers. Martijn: I'm taking this very seriously, this is something that we need to actively avoid.

Reinout van Rees (myself...) - Plone used for semantic web in the construction industry

I won't write my own track down, but the sheets plus complete text are available as a PDF. It is actually a better story on paper than how I presented it. Well, it is a bit of a scientific subject, which is always hard. 
As long as some people are going to take a look at archgenxml, I'm already happy. And if people start to export more data as computer-readable xml, I'm even happier. And if somehow someone got interested in my plone+semweb+construction work... Ecstatic :-) But after every presentation I think to myself "why did I ever do this".

Steven Johnston - Storage Resource Broker (SRB), Large scientific data and Python

Basically: about storing lots of data. Problem: how to manage large scientific datasets? A student generates 30 GB of data and you either just leave it where it is or throw it away. So you lose a lot of data and need to manage it somehow. A scientist doesn't like databases and doesn't like interacting with them, not to mention on the SQL level. So Steven's trying to bring the modern database world to the researchers. He's in the BioSimGrid project that tries to make it all easier for scientists.

The biomolecular simulations are often 10s of GBs of data, but the data can be split into two groups. First the metadata, which is small, with time, date, parameters. They'll access this a lot. Second the actual molecule simulation data, which is huge, but not accessed that often. So they made a grid with some 45TB of capacity split over 5 sites. Some key features: you can deposit data at any site, read data from any site and there's redundant data replication (it's just copied to one other node). They've got a set of python analysis tools for accessing and mining the data in the grid.

The metadata is small and accessed often. So that's replicated to every site in an oracle database (which was a political choice). The simulation data is stored in the SRB (storage resource broker), which is more or less like a distributed file system. A key point is to use the right tool for the job. The simulation results don't need to be stored granularly in some database format, the researchers "just want their big blob of data back". So: search the database for the metadata and retrieve the file afterwards. 
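A rough sketch of that two-step pattern in Python: first query the small metadata store, then fetch the big blob by the path it points to. Everything in it is a stand-in chosen for illustration. An in-memory `sqlite3` database plays the role of the replicated metadata database, a plain dict plays the role of the bulk store, and the table layout, paths and function names are invented, not BioSimGrid's or SRB's actual interface.

```python
import sqlite3

# Stand-in for the small, frequently queried metadata database.
# (The talk mentioned Oracle; sqlite3 is used here only to keep this runnable.)
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE simulation ("
    "id INTEGER PRIMARY KEY, molecule TEXT, run_date TEXT, blob_path TEXT)"
)
db.executemany(
    "INSERT INTO simulation (molecule, run_date, blob_path) VALUES (?, ?, ?)",
    [
        ("lysozyme", "2005-06-01", "/sims/lysozyme-001.dat"),
        ("insulin", "2005-06-02", "/sims/insulin-001.dat"),
    ],
)

# Stand-in for the bulk store holding the huge simulation files.
# Hypothetical paths, tiny payloads.
blob_store = {
    "/sims/lysozyme-001.dat": b"lysozyme trajectory bytes",
    "/sims/insulin-001.dat": b"insulin trajectory bytes",
}


def find_blob_paths(molecule):
    """Step 1: cheap metadata search, returns only paths to the big files."""
    rows = db.execute(
        "SELECT blob_path FROM simulation WHERE molecule = ? ORDER BY id",
        (molecule,),
    )
    return [row[0] for row in rows]


def fetch_blob(path):
    """Step 2: retrieve the whole file as one opaque blob."""
    return blob_store[path]
```

A researcher would call `find_blob_paths("insulin")` to narrow things down and only then pay the cost of `fetch_blob(...)` for the files that matter.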
SRB. Not really a distributed file system. More like a bittorrent-like thing. There wasn't a good interface to SRB. He's currently working on an object oriented python interface. So far it's got an SRB connection, standard python file object support and so on. But still... pulling 30 GB over the wire isn't too handy. So he's got some ideas to associate code with files (the code is normally python) in order to calculate the results at the database. The results (mostly pretty small) and the parameters are stored, so you've got the possibility of caching and sharing of results. As the process is pretty much automated, it wasn't that hard to deal with the research calculation grid: the possibility to farm out big calculations to the calculation grid.

Raphaël Marvie - A Simple Python Framework for Introducing Component Principles

Raphaël normally deals with distributed computing. When we're dealing with objects, we're programming in the small; with components we're programming in the large. Components do not replace objects. Components have contractually specified interfaces and an explicitly stated external context. Explaining components to students is hard, as the existing stuff is either too complex or too specific. So there is a need for a "component activity kit": picolo. It is written in python and the core is, partly thanks to python, pretty small. 300 lines, so students can understand all that goes on.

(He gave a small demo. What I got out of it is that a component architecture is aimed at keeping everything neat and well-defined. Nice to see this presentation after some presentations earlier today where the zope3 architecture was explained, which uses a component architecture! And, yes, to keep everything more neat and well-ordered.)

As zope3 is component based, I asked whether it was a good idea to put the configuration of the connections between the components in a separate (XML) file, like zope3 does. 
According to Raphaël, that was the conclusion of a component-related conference a while ago. Making the components and tying them together are two separate concerns.
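To make those two concerns concrete for myself, a small Python sketch. This is not picolo's API and not zope3's ZCML; the `Greeter` interface, the two classes and the `WIRING` table are all invented for illustration. The components state what they provide or require, and a separate bit of configuration decides which implementation gets plugged in.

```python
from typing import Protocol


class Greeter(Protocol):
    """The contractually specified interface a component may provide."""

    def greet(self, name: str) -> str: ...


class EnglishGreeter:
    """A component providing the Greeter interface."""

    def greet(self, name: str) -> str:
        return f"Hello, {name}"


class Welcomer:
    """A component that requires a Greeter; it never picks one itself."""

    def __init__(self, greeter: Greeter):
        self.greeter = greeter

    def welcome(self, name: str) -> str:
        return self.greeter.greet(name) + "!"


# The wiring lives apart from the component code. In zope3 this role is
# played by a separate XML file; a dict is used here as a stand-in.
WIRING = {"greeter": EnglishGreeter}


def assemble(wiring):
    """Tie the components together according to the wiring table."""
    return Welcomer(greeter=wiring["greeter"]())
```

Swapping in another greeter then means touching only `WIRING`, not `Welcomer`.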
