The international SAT competition is a satellite event of the SAT09 Conference.
Main competition
News
- July 4, 2009
- the results of the various tracks are disclosed! Take a look at the presentation made at SAT09
- May 4, 2009
- final selection of benchmarks disclosed!
Rules at a glance: Winners are selected through a two-stage ranking, with a substantial increase of the maximum allowed CPU time for solvers entering the second stage. Benchmarks are expressed in CNF (DIMACS format) and partitioned into three categories: Applications, Random and Crafted. In each category, there are three specialties: SAT, UNSAT and SAT+UNSAT. There will be one winner per category and specialty. Judges make important decisions on the basis of anonymized information. To be eligible, solvers must be available in their original source code form; binary solvers will be accepted only in the demonstration division. The rules are not rigid: the judges may adapt them when needed. See the detailed rules below.
- Organizers
- Daniel Le Berre and Olivier Roussel
- Technical Advisor
- Laurent Simon
- Judges
- Andreas Goerdt, Ines Lynce and Aaron Stump
Important dates
- March 1st, 2009
- Solver and benchmark submissions for the contest open.
- Follow this link to register for the competition
- March 22, 2009 (UPDATED)
- Solver and benchmark submission deadline.
- June 30 - July 3, 2009
- Results disclosed
New this year
- Absolute scoring SATSPEC09 (independent of the other competitors, as opposed to the previous scoring scheme): a lexicographic ordering on the number of benchmarks solved and then on the sum of the CPU time taken to solve those problems (a small sketch of this ordering follows this list).
- Submissions for the contest are accepted only between the 1st and the 22nd of March. Any solver or benchmark submitted before that window will still enter the contest, but will also be made available to the whole community as early as possible (see the Participation of Organizers section).
- Authors are free to submit either a 32-bit or a 64-bit solver.
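To make the new scoring concrete, here is a small Python sketch of the lexicographic ordering; the data layout and names are illustrative only, not part of the rules:

def satspec09_rank(results):
    """results: dict mapping solver name -> list of (solved, cpu_seconds) pairs."""
    def key(solver):
        runs = results[solver]
        solved = sum(1 for ok, _ in runs if ok)
        cpu = sum(t for ok, t in runs if ok)   # CPU time spent on solved instances
        return (-solved, cpu)                  # most solved first, ties broken by speed
    return sorted(results, key=key)

# Example: A and B both solve 2 benchmarks, but A is faster overall -> A ranks first.
print(satspec09_rank({"A": [(True, 10), (True, 20), (False, 1000)],
                      "B": [(True, 25), (True, 25), (False, 1000)]}))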
Participation of Organizers
This year, in accordance with the judges' decision, organizers and technical advisors are allowed to submit a solver in the competition division. Fairness of the results will still be ensured because:
- Organizers and Technical Advisors who want to participate must publish an MD5 hash of the archive of their source code (competition division) or of their binary (demonstration division) strictly before the 1st of March, when benchmark and solver submission opens. The version entering the contest must be identical to the hashed one. The goal is to prove publicly that the organizers submitted their solver before having access to any solver or benchmark submitted to the competition (a hash-checking sketch follows this list).
Here is the list of solvers submitted by organizers:
- SAT4J CORE 2.1 RC1 by Daniel Le Berre (file=org.sat4j.core.jar md5sum=ff32b730678ee8f8f7b2c32a7ecec67f)
- glucose 1.0 by Gilles Audemard and Laurent Simon (file=glucose.tgz md5sum=88a153121dc6ec557329133c3f712fcd)
- The set of "organizers" is reduced to the strict minimum, i.e. those who need access to benchmarks and solvers between the submission deadline and the actual contest kick-off. Judges and Technical Advisors do not have access to submitted benchmarks and solvers before the contest starts.
- In each category, the final set of benchmarks is selected according to strict rules. When needed, random selection of benchmarks is done with an initial random seed composed from each judge's random seed. The probability that a benchmark is selected depends on its hardness (in the Applications category, hardness is measured by the respective performance of Minisat, Picosat and Rsat, in their 2007 contest versions, on that benchmark). The entire selection process can be reproduced by following the same rules with the same initial seed (a reproducible-selection sketch follows this list). Series of benchmarks (same generator, same BMC problem with different bounds, ...) are proposed by the organizers and the technical advisor with the help of submitters. They are validated by the judges on the basis of benchmark descriptions only (not of the benchmarks themselves).
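The hash itself can be produced with any standard MD5 tool (e.g. md5sum). Here is a minimal Python equivalent for checking a downloaded archive against a published hash:

import hashlib

def md5_of(path):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

# For the archives listed above, one would expect e.g.:
#   md5_of("org.sat4j.core.jar") == "ff32b730678ee8f8f7b2c32a7ecec67f"
#   md5_of("glucose.tgz")        == "88a153121dc6ec557329133c3f712fcd"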
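The reproducible draw can be pictured as follows. This Python sketch is illustrative only: the way the seeds are composed and the exact weighting are assumptions, not the competition's actual rules.

import random

def select_benchmarks(benchmarks, weights, judge_seeds, k):
    """Deterministic hardness-weighted selection without replacement:
    the same seeds and weights always reproduce the same choice."""
    rng = random.Random(sum(judge_seeds))   # composing the judges' seeds (assumed)
    pool, w, chosen = list(benchmarks), list(weights), []
    for _ in range(min(k, len(pool))):
        i = rng.choices(range(len(pool)), weights=w)[0]
        chosen.append(pool.pop(i))          # remove the pick so it is drawn once
        w.pop(i)
    return chosen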
Special tracks
Certified unsat answers
- Organizer
- Allen van Gelder
Important dates
- April 19, 2009
- Solver submission deadline.
- Follow this link to register for this track
In the verified-unsat track of the 2009 SAT Solver Competition, solvers produce some verification output and are only tested on unsatisfiable benchmarks.
Because of the experimental nature of this track, there are no prizes and no official ranking system. See the poster shown at SAT 2007 for that year's track. We expect 2009 to be conducted similarly.
There are several accepted proof formats, each with precise specifications. These formats are designed to be neutral to the solver's methodology as far as possible (in contrast to several "trace" formats proposed in the literature that are specific to the grasp/chaff style).
All formats may be found on the ProofChecker web page, but we expect the most popular to be the RUP format, which is therefore described briefly on this page.
The purpose of the RUP proof format is to allow a solver to "retro-fit" some verification output without too much trouble, and to allow for some possibility of derivation rules other than resolution.
The short story is that solvers in the vein of grasp, chaff, berkmin, minisat, and picosat can simply output their conflict clauses (the final one being the empty clause).
Also, we believe that look-ahead solvers will be able to output a RUP proof without great difficulty. Notice that it is not necessary to actually use all the clauses you output, but if the redundancy is too great, you might hit filesize limits or time limits. Practical experience is needed to see what can be achieved.
The proof should be in a separate file and should consist of one 256-byte header (255 bytes plus the newline), followed by derived clauses in the usual DIMACS format.
Please visit the above URL for additional details and software to check your verification output during development.
The rest of this page goes into details in case there are solvers that use a different approach, or readers just want to see some examples.
The idea is based on
E. Goldberg, Y. Novikov. Verification of proofs of unsatisfiability for CNF formulas. Design, Automation and Test in Europe (DATE), March 2003, pp. 886-891. https://eigold.tripod.com/papers.html
The method by which a derived clause is obtained is not important: it might be resolution or some other means. It is only necessary that the soundness of the derived clauses can be verified by a procedure that we call REVERSE UNIT-PROPAGATION (RUP for short). In the discussion, clauses are delimited by square brackets.
Suppose that F is the input formula and D_1, ..., D_r are the derived clauses that have been output, with D_r = [] (the empty clause). The sequence D_1, ..., D_r is an RUP refutation of F if and only if:
For each i from 1 through r, steps 1--3 below derive []:
1. Negate D_i = [L_{ij}] producing one or more unit clauses, [-L_{ij}].
For example, the negation of [x, y, z] is [-x], [-y], [-z].
(When i = r, there are no unit clauses produced.)
2. Add these unit clauses [-L_{ij}] and D_1 through D_{i-1} to F.
(When i = r, there are no unit clauses to add.)
3. Perform unit-clause propagation.
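Since the check is purely mechanical, a reference implementation fits in a few lines. Here is a minimal (and deliberately naive) Python sketch of steps 1--3 above, assuming clauses are lists of signed integers as in DIMACS:

def unit_propagate(clauses):
    """Return True iff unit-clause propagation derives the empty clause []."""
    forced = set()                    # literals forced true so far
    while True:
        changed = False
        for clause in clauses:
            live = [l for l in clause if -l not in forced]  # drop falsified literals
            if any(l in forced for l in live):
                continue              # clause already satisfied
            if not live:
                return True           # empty clause: conflict found
            if len(live) == 1:
                forced.add(live[0])   # unit clause: force its literal
                changed = True
        if not changed:
            return False

def check_rup(formula, derived):
    """Run steps 1-3 for each derived clause D_i; the last one must be []."""
    known = list(formula)
    for d in derived:
        units = [[-lit] for lit in d]          # step 1: negate D_i into unit clauses
        if not unit_propagate(known + units):  # steps 2-3: add and propagate
            return False
        known.append(d)
    return bool(derived) and derived[-1] == []

# With the example below: check_rup([[1,-4,-3],[1,4],[-1],[-4,3]], [[4,3],[]]) is True.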
EXAMPLES
Say the input file F has these clauses:
c example
p cnf 4 4
1 -4 -3 0
1 4 0
-1 0
-4 3 0
EXAMPLE 1
The RUP proof might be these integers (newlines are for readability):
4 3 0
0
This proof contains two derived clauses, [4 3] and []. Negating [4 3] gives [-4] and [-3], which are added to F. Then unit-clause propagation derives []. Next, add [4 3] (the first derived clause) to F, and again unit-clause propagation derives []. So the proof is verified as sound, although we have no idea why the program output [4 3] as a derived clause.
EXAMPLE 2
Another correct RUP proof for F is:
0
This is because the formula itself can be refuted by unit-clause propagation.
EXAMPLE 3
Another correct RUP proof for F is:
1 -3 0
1 3 0
-3 0
3 0
0
This is the sequence of resolvents in a resolution proof, but no structure is given, so that fact might not be obvious.
Minisat hack track
- Organizers
- Daniel Le Berre and Laurent Simon
- Technical Advisor
- Olivier Roussel
- Judges
- Andreas Goerdt, Ines Lynce and Aaron Stump
Important dates
- March 1st, 2009
- Solver and benchmark submissions for the contest open.
- Follow this link to register for the competition
- March 22, 2009 (UPDATED)
- Solver and benchmark submission deadline.
- June 30 - July 3, 2009
- Results disclosed
Minisat has been known for a long time now and is thus a good candidate for our special track: it can easily be seen as a canonical CDCL solver, offering an efficient (and necessary) CDCL basis. It has been shown that improvements can be obtained through minor changes ("hacking it"). For instance, looking back at recent progress in CDCL solvers, most improvements are based on minor changes (fewer than 10 lines each): phase caching, blocked literals, and Luby-series rapid restarts.
Fixing in advance the amount of allowed change should help the community detect which techniques pay off. To analyze experimental results in a more systematic and scientific way, it is necessary to isolate which causes produce which effects, and to support this with experimental evidence.
Depending on the number of submissions in this track, the organizers may decide to use a shorter time limit or a smaller set of benchmarks than in the main competition. No prizes will be awarded in this track, but we hope the whole community will benefit from it. The two best solvers identified in this track will enter the main competition, to be compared on the same basis as the other solvers. We strongly encourage regular participants in the main track, as well as non-students, to participate.
Many thanks to Niklas Een and Niklas Sorensson for allowing us to propose this special track.
Rules at a glance:
- The version of minisat we'll use for this special track is the 2007 contest version, available here (minisat2-070721.zip)
- Benchmarks will be a selection of the pre-"satelited" (SatELite-preprocessed) benchmarks of the 2009 contest, in the Applications category. Only the "core" version of minisat will be considered (benchmarks will already be simplified).
- Minisat is roughly 2500 lines long (including the libraries in the "mtl" directory, and considering only the "core" version). We will therefore accept up to 125 modified lines (5% of syntactic modification). Modified lines must be readable, classical C++ lines in the style of the original minisat. If a solver contains more than 125 modified lines, or too many exotic lines, the organizers are free to reject the submission.
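One hedged way to estimate the modification count before submitting is to diff each hacked source file against the pristine 2007 sources. A rough Python sketch, where the file extensions, directory layout and paths are assumptions about the minisat archive:

import difflib, pathlib

def changed_line_count(orig_dir, hacked_dir, patterns=("*.C", "*.h")):
    """Count lines added or modified relative to the pristine sources."""
    total = 0
    for pat in patterns:
        for orig in pathlib.Path(orig_dir).rglob(pat):
            hacked = pathlib.Path(hacked_dir) / orig.relative_to(orig_dir)
            if not hacked.exists():
                continue                      # ignore removed files in this rough count
            diff = difflib.unified_diff(orig.read_text().splitlines(),
                                        hacked.read_text().splitlines())
            total += sum(1 for line in diff
                         if line.startswith("+") and not line.startswith("+++"))
    return total

# e.g. changed_line_count("minisat2-070721/core", "myhack/core") should stay <= 125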
Multithread track
- Organizers
- Daniel Le Berre and Laurent Simon
- Technical Advisor
- Olivier Roussel
- Judges
- Andreas Goerdt, Ines Lynce and Aaron Stump
Important dates
- March 1st, 2009
- Solver and benchmark submissions for the contest open.
- Follow this link to register for the competition
- March 22, 2009 (UPDATED)
- Solver and benchmark submission deadline.
- June 30 - July 3, 2009
- Results disclosed
Multithreading is everywhere except in our solvers. This observation contrasts with today's computers, which frequently contain 4 cores, and up to 32 in workstations. Efficient multithreading (parallelization on a single computer) is still a challenge for the SAT community: SAT solvers consume a lot of memory, and naive parallelization performs poorly because memory is a hard bottleneck that cannot be overloaded. In this context, it is striking that state-of-the-art multithreaded solvers are still very simple (for instance, running the same solver with different parameters), despite the real interest of industry in this area.
Multithreaded solvers entering this track are required to take as a parameter the maximum number of threads or processes they are allowed to use (MAXNBTHREAD). If time permits, this parameter will be used to determine how solvers scale with the number of CPUs. We plan to use at least 4 cores, and hope to have access to a 16-core computer with 68GB of memory for a limited amount of time. We are especially interested in solvers that scale very well with the number of available cores.
It is important to emphasize that solvers are expected to be deterministic, that is, to show the same behaviour (in particular, approximately the same CPU time) when run several times on the same instance. Even though determinism is harder to achieve for multiprocess solvers, a typical user will only run a solver once on an instance. Therefore, solvers will be run only once on a given instance.
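To make the expected interface concrete, here is a toy Python sketch; everything about it is hypothetical except the MAXNBTHREAD command-line parameter. Each worker process brute-forces the same tiny hard-coded formula with a fixed seed, so repeated runs behave identically. It illustrates only the parameter and the determinism requirement, not a real parallel solver.

import itertools, multiprocessing as mp, random, sys

def brute_force(formula, nvars, seed):
    """Enumerate assignments in a seed-fixed order (deterministic per seed)."""
    order = list(itertools.product([False, True], repeat=nvars))
    random.Random(seed).shuffle(order)
    for a in order:
        if all(any(a[abs(l) - 1] == (l > 0) for l in cl) for cl in formula):
            return "SATISFIABLE"
    return "UNSATISFIABLE"

def worker(seed, formula, nvars, out):
    out.put(brute_force(formula, nvars, seed))

if __name__ == "__main__":
    maxnbthread = int(sys.argv[1])            # maximum threads/processes allowed
    formula, nvars = [[1, -4, -3], [1, 4], [-1], [-4, 3]], 4  # toy instance
    out = mp.Queue()
    procs = [mp.Process(target=worker, args=(s, formula, nvars, out))
             for s in range(maxnbthread)]
    for p in procs:
        p.start()
    print("s", out.get())                     # every worker reports the same answer
    for p in procs:
        p.terminate()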
Preprocessors track
- Organizers
- Daniel Le Berre, Olivier Roussel and Laurent Simon
- Judges
- Andreas Goerdt, Ines Lynce and Aaron Stump
Important dates
- March 1st, 2009
- Solver and benchmark submissions for the contest open.
- Follow this link to register for the competition
- March 22, 2009 (UPDATED)
- Solver and benchmark submission deadline.
- June 30 - July 3, 2009
- Results disclosed
Efficient preprocessing was one of the most important advances of the last decade. This phase is essential to limit the number of clauses and variables of industrial benchmarks. However, further progress is still awaited, and many questions remain open. For instance, simply shuffling an instance may improve solver performance (only in very rare cases, however; shuffling is not the best way to preprocess an instance!). This track is intended as a very first and imperfect way of evaluating preprocessors. Evaluation will be based on comparing the time needed by a reference solver to solve a given instance against the time needed by the preprocessor plus the same reference solver on the same instance. The reference solver used in this special track is the 2007 contest version of minisat, available here (minisat2-070721.zip).
We ask preprocessors to preserve the satisfiability of the original formula. Preprocessors are not required to generate equivalent formulas, but they must be able to extend any model found by the solver to a model of the original formula (UNSAT certificates will not be considered). The preprocessor must generate a CNF file that respects the standards of the contest (no holes in variable numbering, for instance). Then the 2007 contest version of minisat (minisat2-070721.zip) will be run. Finally, contestants will have to analyze minisat's answer to produce an answer and, when the instance is satisfiable, a model of the initial CNF file.
Submitters will have to submit a program or a script that
- generate in the TMPDIR directory a preprocessed CNF file
- run the core version of minisat2-070721 (executable provided by
the organizers) with the name of the preprocessed CNF file as first
parameter and the name of the file where minisat should store the
solution as second parameter. To run minisat, you must strictly conform
to the following model:
minisat $TMPDIR/your_own_preprocessed_file.cnf $TMPDIR/your_own_solution_file
The executable of the core version of minisat2-070721 will be in the PATH of the evaluation environment during this track.
- examine the status code of minisat and print the answer following the same rules as the main track (see the solver requirements). When the instance is satisfiable, the model given on the "v " line MUST be a model of the initial CNF file (not of the preprocessed file). To achieve this, your preprocessor may have to generate a file mapping the variables of the initial CNF to the variables of the preprocessed CNF, possibly together with other files. In any case, all files must be generated in the TMPDIR directory.
A minimal example script (csh):
#!/bin/csh
preprocessor $1 $TMPDIR/tmp.cnf     # 'preprocessor' stands for your own program
minisat $TMPDIR/tmp.cnf $TMPDIR/sol # this line is imposed (but file names can be changed)
switch($status)
case 10:                            # minisat exit code 10: satisfiable
echo 's SATISFIABLE'
generateModel $TMPDIR/sol           # this program must output a model of $1
breaksw
case 20:                            # minisat exit code 20: unsatisfiable
echo 's UNSATISFIABLE'
breaksw
default:
echo 's UNKNOWN'
breaksw
endsw
The SatELite script (https://www.minisat.se/SatELite.html) is another (more involved) example.
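For the generateModel step, everything depends on what your preprocessor records. Purely as an illustration, the Python sketch below assumes the preprocessor wrote a plain map file of "preprocessed_var original_var" pairs and that no original variable was eliminated outright; a real preprocessor (e.g. one performing variable elimination) would need to replay its simplification steps instead.

import sys

def extend_model(solution_file, map_file):
    """Translate minisat's model of the preprocessed CNF back to the
    original variables, via an assumed 'new_var orig_var' map file."""
    tokens = open(solution_file).read().split()
    if tokens[0] != "SAT":           # assumed minisat2 layout: SAT/UNSAT, then literals
        return None
    model = {abs(int(t)): int(t) > 0 for t in tokens[1:] if t != "0"}
    mapping = dict(tuple(map(int, line.split())) for line in open(map_file))
    return {orig: model[new] for new, orig in mapping.items() if new in model}

if __name__ == "__main__":
    m = extend_model(sys.argv[1], sys.argv[2])
    if m is not None:
        print("v", " ".join(str(v if val else -v)
                            for v, val in sorted(m.items())), "0")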
Non-clausal track
- Organizers
- Armin Biere, Edmund Clarke, John Franco and Sean Weaver
The performance of CNF Satisfiability solvers has improved by several orders of magnitude in recent years. This is largely due to new technologies: the replacement of tree search by DAG search, improved breadth-first and depth-first look-ahead strategies, search-space reduction by adding uninferred yet safe constraints, and many more. SAT solvers now have a good measure of control over the search space. As a result, for numerous problems it has become desirable to translate instances to Satisfiability, where they are solved more efficiently than with other techniques. In fact, some people in industry have stated that SAT ranks alongside Constraint Programming and Mixed Integer Programming as one of the three principal methods for solving combinatorial problems.
But there are still quite a few problems that are hard for CNF solvers. There are several reasons for this, some of which are:
- Important domain-specific information is garbled when converted to CNF. This may burden a SAT solver unnecessarily.
- Problem constraints may be naturally difficult for CNF solvers. For example, a system of exclusive-or constraints is difficult for CNF solvers although these are essentially counting problems and pose little difficulty for other approaches including algebraic methods.
- Some problems can be more concisely expressed in another domain.
- The nature of the problem prevents meaningful learning until deep in the search.
In recent years several branches of SAT research have explored non-clausal solutions to these problems. Some examples are Groebner bases, pseudo-Boolean constraints, and Satisfiability Modulo Theories. Other ideas, generally inspired by specific problem domains such as AIG and Bounded Model Checking, have resulted in SAT solvers that take a different tack. To encourage development along these lines, we are organizing a SAT solver competition in 2009 where all input instances are non-clausal: they come directly from actually encountered problem domains and are not translations from another form. All solvers are invited to participate, but CNF solvers, for example, must be able to convert from the input language specified below to CNF on their own.
The competition will be run on a Beowulf cluster of 128 nodes housed in the Department of Computer Science at the University of Cincinnati. There will be three judges. Scoring will follow that of the main SAT competition.
There will be several competition tracks, each accepting a particular input language. The reason for this is that multiple tracks/languages provide multiple possibilities for expressing problems natively, without translation to CNF or another form. One track will use the AIGER input language, which is well developed and has been used in past competitions. This will allow last year's AIG solvers and new solvers to participate without investing significant effort in parsing and type-checking new non-clausal input languages. Utilities for parsing AIG formulas already exist. Many problems in bounded model checking and k-induction are in the form of Boolean circuits that can be represented as AIG formulas. A higher-level language will allow collections of arbitrary Boolean functions and special constraints to be input. Other tracks will likely be formulated soon. Details will be distributed once they have been worked out and agreed upon by the organizers.
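For readers new to AIGER, here is the classic minimal example from the format's documentation: an ASCII (aag) file encoding a single AND gate. The header reads "aag M I L O A" (maximum variable index, number of inputs, latches, outputs and AND gates); a variable i is encoded as literal 2i, its negation as 2i+1.
aag 3 2 0 1 1
2
4
6
6 2 4
The inputs are literals 2 and 4, the sole output is literal 6, and the last line defines the gate 6 = 2 AND 4.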
Last updated: 2008-12-29