160 Commits

Author SHA1 Message Date
sorawee
daa20d1676 Use temporary directory more effectively (#34)
* put smt2 files in temp dir

Now that we have a proper per-run temp directory, we can put
generated smt2 files in the temp directory, so that the files do not
clutter the main temp directory, and so that the files will be
cleaned up when Picus finishes execution.

* feat: add --noclean to prevent cleaning

Picus initially never cleaned up temporary files. However, the last commit
changed this so that temporary files are cleaned up. Sometimes, though,
we really do not want to clean up files (e.g. in order to generate
smt2 files for debugging purposes). This commit adds the flag to
control whether to run the clean up.
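
A minimal sketch of the idea, written in Python rather than Picus's Racket. The function name, directory prefix, and file name are hypothetical; only the `--noclean` behavior (skip the cleanup step) comes from the message above.

```python
import shutil
import tempfile
from pathlib import Path


def run_with_temp_dir(noclean: bool = False) -> Path:
    """Write a query file into a per-run temp dir; clean up unless noclean."""
    # Hypothetical prefix and file name, used only for illustration.
    workdir = Path(tempfile.mkdtemp(prefix="picus-sketch-"))
    (workdir / "query.smt2").write_text("(check-sat)\n")
    # ... a solver would be invoked on the generated file here ...
    # This point is reached only on normal completion, so an exception
    # raised above leaves the files in place for inspection.
    if not noclean:
        shutil.rmtree(workdir)
    return workdir
```

With `noclean=True` the directory and its smt2 file survive the run, which is the debugging use case described above.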
2023-09-20 23:12:53 -05:00
Sorawee Porncharoenwase
0af4b4dd9b feat: support optimization level for circom compilation 2023-09-16 02:40:49 +07:00
Sorawee Porncharoenwase
6bc51990c9 feat: support processing circom file directly
This commit adds a capability to process circom files directly
via the `--circom` flag. This is done by creating a
temporary directory to store the compiled r1cs file.
The temporary directory is cleaned up on exit when Picus exits normally.
However, when Picus crashes or when it is interrupted,
the temporary directory will not be cleaned, giving the user an
opportunity to inspect files.
2023-09-16 02:40:49 +07:00
sorawee
affb4cc3e8 ci: support timing out and optimization flag (#31)
Prior to this commit, we could not write tests expecting a timeout.
This commit adds the ability to do so.

Another major change is to support tests with an optimization flag.
When Circom is given -O2, Circom files without public inputs
(which are all of our benchmarks) will fail to be compiled correctly.
This commit adds a heuristic to detect such a situation,
and then "patch" the Circom file to add public inputs.
This is done by doing Circom compilation twice.
The first compilation allows us to read information from R1CS file,
which is then used for patching. The second compilation compiles
the patched file.
2023-09-07 18:57:22 -05:00
sorawee
f696a052a2 fix: speed up variable extractions (#32)
Prior to this commit, the variables were repeatedly appended at the back of
a list, causing quadratic running time. Since ordering doesn't matter,
this commit makes it append to the front instead.

Note that the algorithm overall is still potentially quadratic.
A proper fix needs to change the interface significantly to introduce
an additional accumulator. This commit simply makes a minimal fix for
the immediate issue that we are having,
which causes CompConstant to get stuck.
2023-09-07 18:46:05 -05:00
sorawee
2b28e47055 fix: handle interrupt correctly (#30)
This commit fixes the problem where an interrupt will not kill a
solver spawned by Picus, causing many solvers to run concurrently in
the background. Since the fix needs to be applied to three files
that communicate with cvc5, Z3, and cvc4, I also took an opportunity to
refactor them and unify them into one common file.
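
The general pattern can be sketched as follows, in Python rather than Racket. The `run_solver` name and the `on_started` hook are hypothetical and exist only so the interrupt path can be exercised; this is not Picus's actual solver interface.

```python
import subprocess
import sys


def run_solver(cmd, on_started=None):
    """Run a child process and make sure it dies if we are interrupted."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    try:
        if on_started is not None:
            on_started(proc)  # hypothetical hook, e.g. to simulate Ctrl-C
        out, _ = proc.communicate()
        return out
    finally:
        # Runs on normal exit, on errors, and on KeyboardInterrupt alike.
        if proc.poll() is None:
            proc.kill()
            proc.wait()
        # Closing the pipes releases their file descriptors.
        for pipe in (proc.stdout, proc.stderr):
            if pipe is not None:
                pipe.close()
```

Keeping the kill-and-close logic in one `finally` block mirrors the motivation for unifying the three solver files: the fix lives in a single place.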
2023-09-01 14:01:15 -05:00
Sorawee Porncharoenwase
0013e5c0ce ci: improve the output
The markup ##[...] makes GHA group outputs together, making it easy to
compare results across runs.
2023-08-31 08:28:21 -07:00
Sorawee Porncharoenwase
37605c0d76 fix: make printer for query generation more efficient
We should avoid building the resulting query string with `format`
repeatedly, because `format` creates an entirely new string in
each invocation. When the query is big, the cost is
prohibitively expensive.

This commit instead builds the query string by writing to a
string port directly (similar to the StringBuilder idiom in languages
like Java). As the fix needs to be done on Z3, cvc4, and cvc5, I also
take an opportunity to refactor them to reduce code redundancy.
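
A rough Python analogue of the two approaches, where `io.StringIO` plays the role of Racket's string port. The constraint shape and the emitted `assert` syntax are illustrative assumptions, not the queries Picus actually generates.

```python
import io


def build_query_by_format(constraints):
    """Rebuilds the whole string each iteration: quadratic in total length."""
    query = ""
    for lhs, rhs in constraints:
        query = "{}(assert (= {} {}))\n".format(query, lhs, rhs)
    return query


def build_query_by_buffer(constraints):
    """Writes each piece once into an in-memory buffer: linear."""
    out = io.StringIO()
    for lhs, rhs in constraints:
        out.write("(assert (= ")
        out.write(lhs)
        out.write(" ")
        out.write(rhs)
        out.write("))\n")
    return out.getvalue()
```

Both functions produce the same text; only the amount of copying differs.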
2023-08-30 14:37:09 -07:00
Sorawee Porncharoenwase
d9ae5ed3d9 fix: make query generation more efficient
Use vector indexing instead of list indexing.
This avoids the cost of list traversal, which is significant in a hot loop.
2023-08-30 14:29:59 -07:00
sorawee
4d6e48bcc7 fix: make binary01-lemma more efficient (#22)
Switch list-ref to direct element iteration, reducing quadratic running time to linear.
2023-08-30 14:40:35 -05:00
sorawee
8515124cdf fix: make aboz lemma more efficient (#24)
Instead of using `rcmds-ref` in a loop, which takes quadratic time,
directly iterate over three consecutive elements simultaneously,
making it linear time traversal.
This change keeps the large benchmark from getting stuck at the aboz lemma.
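
The single-pass traversal over three consecutive elements can be sketched like this in Python. The helper name is hypothetical; in Racket the point is to avoid repeated `rcmds-ref` calls on a linked list, each of which walks from the head.

```python
def consecutive_triples(items):
    """Yield (a, b, c) for every run of three adjacent elements in one pass."""
    it = iter(items)
    try:
        a = next(it)
        b = next(it)
    except StopIteration:
        return  # fewer than two elements: no triples
    for c in it:
        yield a, b, c
        a, b = b, c
```

Because it only ever advances an iterator, the sketch does linear work on any sequence, including linked lists.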
2023-08-30 14:38:15 -05:00
sorawee
9505e67158 fix: make bim lemma more efficient (#25)
This is done by switching from list-ref in a loop to direct iteration.
Also fix a typo in the pass label, changing it from "aboz" to "bim".
2023-08-30 14:37:43 -05:00
Yanju Chen
6cbdd78d71 Merge pull request #21 from sorawee/fix-basis2-lemma-efficiency
fix: improve performance of basis2-lemma
2023-08-30 10:52:31 -07:00
Sorawee Porncharoenwase
e5c3b09e73 fix: improve performance of basis2-lemma
Previously, we used `(list->set (set->list ks))` to copy the set `ks`,
but this is in fact not needed at all. `ks`, which is an immutable set,
can't be structurally mutated (although its binding can be mutated).

The copying is especially harmful when it's done in a hot loop
(the `update` function in particular), since the known and unknown sets could
be very large.

This change keeps the large benchmark from getting stuck at basis2-lemma.
2023-08-30 10:13:00 -07:00
sorawee
186d62db39 fix: reimplement linear lemma to make it more efficient (#18)
Prior to this commit, the linear lemma depended on the CDMAP and RCDMAP
procedures.

Let's say that constraint i involves variables Xi and Yi, where Xi variables
are the linear part, and Yi variables are the non-linear parts.
That is to say, the input follows this format:

  Constraint 1: X11 X12 ... Y11 Y12 ...
  Constraint 2: X21 X22 ... Y21 Y22 ...
  ...
  Constraint n: Xn1 Xn2 ... Yn1 Yn2 ...

which is of size O(sum |Xi| + sum |Yi|).

The CDMAP procedure produces a map from a variable to a list of lists of
variables. Meaning: a key can be uniquely determined if, in at least one of
its lists, all variables are uniquely determined.

Thus, CDMAP produces data of size O(sum { |Xi| (|Xi| + |Yi|) }) as the
output (where values of the same key are grouped together):

  X11: X12 X13 ... Y11 Y12 ...
  X12: X11 X13 ... Y11 Y12 ...
  ...
  X1k: X11 X12 ... Y11 Y12 ...

  X21: X22 X23 ... Y21 Y22 ...
  X22: X21 X23 ... Y21 Y22 ...
  ...

  .....

  Xn1: Xn2 Xn3 ... Yn1 Yn2 ...
  Xn2: Xn1 Xn3 ... Yn1 Yn2 ...
  ...
  Xnk: Xn1 Xn2 ... Yn1 Yn2 ...

Often, some clauses might have O(n) variables, making the size
O(n^3) if the table is dense, or O(n^2) if the table is sparse.

The RCDMAP procedure essentially consumes a map that CDMAP procedure
outputs, and produces the inverse of that map, with duplicate keys of
the input map grouped together. The size is thus the same.

RCDMAP is also used in the counter selector. E.g., given:

  1 2 3 | 4 6
  1 3 5 | 4 6

Then CDMAP produces:

  1: [2 3 4 6, 3 5 4 6]
  2: [1 3 4 6]
  3: [1 2 4 6, 1 5 4 6]
  5: [1 3 4 6]

And RCDMAP produces:

  1 3 4 6: [2, 5]
  2 3 4 6: [1]
  3 5 4 6: [1]
  1 2 4 6: [3]
  1 5 4 6: [3]

The counter selector counts that 5 occurs 2 times in the keys,
so it is weighted 2. 6, by contrast, occurs 5 times, so it is weighted 5.
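
The example above can be reproduced with a small Python sketch. The function names are hypothetical, and treating each RCDMAP key as an unordered set is an assumption; the sketch follows the description in this message, not the removed Racket code.

```python
from collections import Counter, defaultdict


def cdmap(constraints):
    """Map each linear variable to the variable lists that determine it."""
    result = defaultdict(list)
    for linear, nonlinear in constraints:
        for x in linear:
            others = tuple(v for v in linear if v != x) + tuple(nonlinear)
            result[x].append(others)
    return dict(result)


def rcdmap(cd):
    """Invert a CDMAP, grouping variables that share a determining set."""
    result = defaultdict(list)
    for var, deps_list in cd.items():
        for deps in deps_list:
            result[frozenset(deps)].append(var)
    return dict(result)


def key_weights(rcd):
    """Weight of a variable = number of RCDMAP keys it occurs in."""
    counts = Counter()
    for key in rcd:
        counts.update(key)
    return counts


example = [((1, 2, 3), (4, 6)), ((1, 3, 5), (4, 6))]
weights = key_weights(rcdmap(cdmap(example)))
```

Here `weights[5]` is 2 and `weights[6]` is 5, matching the counts stated above. Note that each input row with k linear variables expands into k map entries, which is the blow-up described earlier.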

-------------------------------------------------------------------

This commit aims to improve the performance by avoiding blowing up the
data size beyond linear. It also aims to preserve the existing behavior
as much as possible.

This is done by removing the CDMAP and RCDMAP procedures,
and instead employing the straightforward encoding:

  X11 X12 ... | Y11 Y12 ...
  X21 X22 ... | Y21 Y22 ...
  ...
  Xn1 Xn2 ... | Yn1 Yn2 ...

which has linear size.

When a variable is uniquely determined, we simply cross it off from the
above table. A variable becomes propagable when it's the only one left
in the linear part, and the non-linear part is empty.
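
A minimal Python sketch of this crossing-off rule. The function name and the sample rows are hypothetical, and the naive rescanning loop is kept for clarity; it illustrates the propagation condition only and does not claim the complexity of the real implementation.

```python
def propagate(constraints, known):
    """Return all variables determined by crossing known ones off each row."""
    known = set(known)
    rows = [(set(linear), set(nonlinear)) for linear, nonlinear in constraints]
    changed = True
    while changed:
        changed = False
        for linear, nonlinear in rows:
            # Cross off everything that is already uniquely determined.
            linear -= known
            nonlinear -= known
            # Propagable: one linear variable left, non-linear part empty.
            if len(linear) == 1 and not nonlinear:
                known.add(linear.pop())
                changed = True
    return known


# Hypothetical rows in the (linear | non-linear) encoding.
rows = [((1, 2), ()), ((2, 3), (1,)), ((4, 5), (6,))]
determined = propagate(rows, known={1})
```

Starting from variable 1, the first row yields 2 and the second row then yields 3, while the third row stays blocked by its non-linear part.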

For the counter selector, we produce the counter without expanding the
encoding beyond linear size. This can be done by simple multiplication:
in (Xi, Yi), each variable in Xi contributes to other variables in Xi
exactly |Xi|-1 times (not counting itself) and each variable in Yi
contributes to other variables in Xi exactly |Xi| times.
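
The multiplication rule can be written down directly. This is a Python sketch with a hypothetical function name, applied to the same two rows used in the concrete example of this message.

```python
from collections import Counter


def linear_weights(constraints):
    """Counter-selector weights computed without expanding the encoding."""
    weights = Counter()
    for linear, nonlinear in constraints:
        n = len(linear)
        for x in linear:
            weights[x] += n - 1  # contributes to every other linear variable
        for y in nonlinear:
            weights[y] += n  # contributes to every linear variable
    return weights


weights = linear_weights([((1, 2, 3), (4, 6)), ((1, 3, 5), (4, 6))])
```

Each row is visited once, so the work is proportional to the size of the encoding. For these rows, variable 5 gets weight 2 and variable 6 gets weight 6.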

The results are not the same, however. Consider the earlier concrete example:

Our encoding:

  1 2 3 | 4 6
  1 3 5 | 4 6

RCDMAP:
  1 3 4 6: [2, 5]
  2 3 4 6: [1]
  3 5 4 6: [1]
  1 2 4 6: [3]
  1 5 4 6: [3]

The weights of 5 in both our encoding and RCDMAP agree: 2.
However, the weights of 6 do not agree.

  Our encoding produces weight 6.
  RCDMAP produces weight 5.

Arguably, our weight is more "correct", because it takes the value
part into account. I.e., in "1 3 4 6: [2, 5]", 6 contributes twice
(2 once and 5 once), so it should be counted twice.

Separately, I also believe that the counter selector is not ideal,
and plan to revamp the strategy soon, so the discrepancies will not
matter anyway.
2023-08-30 12:03:28 -05:00
sorawee
ea0cb6a4e4 fix: compute unknown set correctly (#20)
In the basis2 lemma, set-remove (which removes a single element) was used
instead of set-subtract (which removes a set of elements).
Previously, Picus got away with this incorrect result because the linear
lemma would "clean up" afterward by syncing the known set and unknown set.
However, since PR #18, we no longer do this syncing,
thus uncovering this issue. This commit fixes the problem.
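
A Python analogue of the mix-up, using frozensets. The function names are hypothetical, and the mapping from Racket's `set-remove` and `set-subtract` onto these two expressions is an approximation for illustration.

```python
def drop_as_single_element(unknown, newly_known):
    """Buggy analogue: tries to remove the whole set as if it were one element."""
    return unknown - {newly_known}


def drop_all_elements(unknown, newly_known):
    """Intended behavior: remove every element of newly_known."""
    return unknown - newly_known


unknown = frozenset({1, 2, 3, 4})
newly_known = frozenset({2, 3})
```

In the first version nothing is removed, because the set `{2, 3}` itself is not a member of `unknown`; the result silently stays too large.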
2023-08-30 11:16:01 -05:00
sorawee
7c53b26685 ci: fix mistakes from wrong compilation flag + add even more tests (#19)
PR #16 incorrectly concluded the expectation of some tests because I ran
them with a different optimization flag. This commit fixes the issue and
adds all other applicable tests in circomlib. These tests take less than
5 minutes to run in total, so that's not too long to run as a part of CI.
2023-08-30 11:15:39 -05:00
Sorawee Porncharoenwase
d570e92a60 chore: rename xlist to varlist, general cleanup 2023-08-29 13:01:37 -07:00
sorawee
dac94283f7 fix: make known set computation more efficient (#15)
Instead of using list-ref in a loop (which takes quadratic time)
and list membership in a loop (which also takes quadratic time),
switch to direct element iteration and set membership query.
This reduces the running time from more than 10 minutes (not fully measured;
I terminated the run early rather than wait) to 0.6s.
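
The membership half of this change looks roughly as follows in Python. The function names and inputs are hypothetical; Python lists have constant-time indexing, so only the list-membership cost carries over from the Racket original.

```python
def known_by_list_scan(candidates, known_list):
    """Each `in` scans the list: O(len(candidates) * len(known_list))."""
    return [v for v in candidates if v in known_list]


def known_by_set_lookup(candidates, known_list):
    """Build a set once, then each lookup takes constant time on average."""
    known = set(known_list)
    return [v for v in candidates if v in known]
```

Both return the same result; the second avoids rescanning the known variables for every candidate.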
2023-08-29 12:30:13 -05:00
sorawee
718cb17847 ci: add even more tests (#16)
Since I will be modifying the algorithm, I want to run more tests to
make sure I don't break anything.
2023-08-29 12:28:58 -05:00
sorawee
ff64bd9220 feat: support --verbose and --cex-verbose (#14)
This commit renames the existing --verbose to --cex-verbose,
and makes --verbose control the non-counterexample output,
which could be overwhelming on a very large circuit.
Similar to --cex-verbose, there are three levels for --verbose.
- 0: hides "large" outputs entirely.
- 1: shows the output, but displays `...` when the output is too large.
- 2: shows the full output.
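
The three levels could be modeled like this. This is a Python sketch in which the function name, the size limit, and the exact truncation format are assumptions, as is the choice to omit the output at level 0 regardless of its size.

```python
def render_output(text, verbose, limit=20):
    """Return the text to print for a potentially large output, or None."""
    if verbose == 0:
        return None  # assumed: omitted entirely at level 0
    if verbose == 1 and len(text) > limit:
        return text[:limit] + " ..."  # assumed truncation format
    return text  # level 2, or level 1 with a small output
```

For instance, `render_output("abcdefghij", 1, limit=4)` gives `"abcd ..."`, while level 2 returns the text unchanged.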
2023-08-28 21:27:28 -05:00
sorawee
98e7c737e9 fix: make constraint generation more efficient (#13)
* chore: remove dead code

* fix: make constraint generation more efficient

Prior to this commit, we used a list as the data structure to keep inputs,
and then accessed them by index, which is inefficient.
This commit fixes the problem by converting the list to a vector
for vector indexing. On the performance benchmark, it reduces
the running time for constraint generation from 8 minutes to 1 second.
2023-08-28 18:39:35 -05:00
Sorawee Porncharoenwase
cefcb428db fix: make parser significantly more efficient
Prior to this commit, `read-r1cs` copied bytes in the file over and over
again in a hot loop, causing the time complexity to be quadratic.
This commit switches to use index-based access in the hot loop instead,
resulting in a large performance improvement.
The benchmark file that accompanies this fix took 30 minutes to
successfully parse (according to @shankarapailoor).
This commit reduces the parsing time to 2 seconds.

It should be noted that not all byte copying is avoided,
since byte copying outside the hot loop, although not ideal,
does not really impact the performance.
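
The difference between re-slicing and index-based access can be sketched in Python on a toy buffer of little-endian 32-bit integers. The layout and function names are illustrative assumptions; this is not the R1CS format or the `read-r1cs` code.

```python
import struct


def parse_by_slicing(data):
    """Quadratic: data[4:] copies the rest of the buffer on every step."""
    values = []
    while data:
        values.append(int.from_bytes(data[:4], "little"))
        data = data[4:]
    return values


def parse_by_offset(data):
    """Linear: read each word in place at a moving offset, no re-copying."""
    values = []
    for offset in range(0, len(data), 4):
        (value,) = struct.unpack_from("<I", data, offset)
        values.append(value)
    return values
```

Both assume the buffer length is a multiple of four. On large inputs the second version avoids copying the remaining bytes at every step.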
2023-08-28 16:23:53 -07:00
Sorawee Porncharoenwase
16bec3fbcb chore: rename picus-dpvl-uniqueness.rkt to picus.rkt 2023-08-26 09:33:32 -07:00
sorawee
f3a87e1523 fix: read prime from circuit (#10)
We previously hard coded the prime number and the log of the prime number.
This commit instead reads the prime number from the circuit.
2023-08-25 17:22:01 -05:00
sorawee
9b3a867ff5 solvers: close ports properly (#9)
On benchmarks that are not quickly solved, I keep encountering the
error:

  subprocess: process creation failed
    system error: Too many open files; errno=24

This is because ports are not properly closed when the corresponding
process is killed. This commit fixes the problem.

In the long run, it would be better to restructure these modules to
share common code, so that the fix can be done at a single place.
2023-08-25 17:20:52 -05:00
sorawee
5d70478aef ci: run ci on pull_request and add basic tests (#8)
- Prior to this commit, every job depended on `publish-docker`, which is only
  run on push. Therefore, all jobs were skipped. This commit fixes the
  issue by removing the dependency.

  - `publish-docker` is now also run as the last step, only when all tests pass.

  - All subsequent jobs are now run on veridise/picus:git-latest,
    so that we do not need to push first.

- The tests are slightly more sophisticated. Previously, they only ensured
  that there was no error. Now, we also check against the expected
  output (underconstrained or not).

- Switch to the Racket and Rosette that are already installed in the
  image. It turns out that we need to set the environment variable
  `PLTADDONDIR` for this to work, because $HOME is overridden in containers
  in GHA (https://github.com/actions/runner/issues/863).

- Remove a job that only compiles circomlib.
  There's no point in doing that.
2023-08-23 13:23:29 -05:00
sorawee
f4f6e17df2 cvc5: minor improvement (#7)
Add comments to explain the regexp and add a check so that we get nice error
messages when cvc5 returns an unexpected model.
2023-08-22 15:10:19 -05:00
sorawee
5d902c6f47 Restructure counterexample output format for readability (#5)
This commit adds a flag `--verbose` for verbose level,
which supplants the flag `--raw-output` (or, previously, `--map`).
- When the verbose level is 0 (not verbose),
  the output is in the circom variable format.
- When the verbose level is 1, the output is a mix of the circom
  variable format and the r1cs signal format, where we prefer the circom
  variable format whenever possible.
- When the verbose level is 2, the output is always in
  the r1cs signal format.

For `--verbose 0`, the output now has three sections: inputs, first
possible outputs, and second possible outputs.

For other verbosity levels, there could be four sections. The extra
section is "other bindings".

Entries that differ between the first possible outputs and
second possible outputs are further highlighted with ANSI escape sequences.

Examples:

With `--verbose 0`:

```
  # inputs:
    # m1.main.inp: 0
  # first possible outputs:
    # m1.main.out[0]: 0
    # m1.main.out[1]: 0
    # m1.main.success: 0
  # second possible outputs:
    # m2.main.out[0]: 1
    # m2.main.out[1]: 0
    # m2.main.success: 1
```

With `--verbose 1`:

```
  # inputs:
    # m1.main.inp: 0
  # first possible outputs:
    # m1.main.out[0]: 0
    # m1.main.out[1]: 0
    # m1.main.success: 0
  # second possible outputs:
    # m2.main.out[0]: 1
    # m2.main.out[1]: 0
    # m2.main.success: 1
  # other bindings:
    # one: 1
    # p: 21888242871839275222246405745257275088548364400416034343698204186575808495617
    # ps1: 21888242871839275222246405745257275088548364400416034343698204186575808495616
    # ps2: 21888242871839275222246405745257275088548364400416034343698204186575808495615
    # ps3: 21888242871839275222246405745257275088548364400416034343698204186575808495614
    # ps4: 21888242871839275222246405745257275088548364400416034343698204186575808495613
    # ps5: 21888242871839275222246405745257275088548364400416034343698204186575808495612
    # zero: 0
```

With `--verbose 2`:

```
  # inputs:
    # x4: 0
  # first possible outputs:
    # x1: 0
    # x2: 0
    # x3: 0
  # second possible outputs:
    # y1: 1
    # y2: 0
    # y3: 1
  # other bindings:
    # one: 1
    # p: 21888242871839275222246405745257275088548364400416034343698204186575808495617
    # ps1: 21888242871839275222246405745257275088548364400416034343698204186575808495616
    # ps2: 21888242871839275222246405745257275088548364400416034343698204186575808495615
    # ps3: 21888242871839275222246405745257275088548364400416034343698204186575808495614
    # ps4: 21888242871839275222246405745257275088548364400416034343698204186575808495613
    # ps5: 21888242871839275222246405745257275088548364400416034343698204186575808495612
    # zero: 0
```
2023-08-22 15:08:36 -05:00
shankarapailoor
5934bf3b2f Merge pull request #6 from sorawee/ff-literal
cvc5: support finite field literal
2023-08-22 12:21:52 -05:00
Sorawee Porncharoenwase
266227e2b7 cvc5: support finite field literal
Since 368f3c3, cvc5 can produce a finite field literal, in the format:

  #f <value> m <mod-value>

See 368f3c3ed6

This commit adds support for finite field literals.
It also refactors the existing model parsing to use S-expression reading
that is already built-in to Racket.
2023-08-21 21:42:32 -07:00
Sorawee Porncharoenwase
aa569bc26e readme: fix math mode not rendered correctly 2023-08-21 15:19:52 -07:00
shankarapailoor
65f0ebe2dc Merge pull request #3 from sorawee/change-output-format-default
Default the output format to circom variables
2023-08-21 13:18:20 -05:00
Sorawee Porncharoenwase
0e03815084 Default the output format to circom variables
Prior to this commit, the counterexample output format defaulted to r1cs
signals (although the help page incorrectly said otherwise).
This commit fixes the help page and at the same time switches
the default format to circom variables, since the format is more
readable for clients. The flag `--raw-output` can still be used to view
the r1cs signals, which could be useful for debugging.
2023-08-21 11:08:58 -07:00
Yanju Chen
cea6fb9a98 Merge pull request #6 from Veridise/basis2-fix
BUG FIX: basis2 lemma only valid for bit sizes < 254.
2023-08-18 09:41:53 -07:00
Yanju Chen
c120fd0437 Merge pull request #7 from obatirou/update-documentation
doc: results interpretation
2023-08-18 09:41:04 -07:00
oba
3b750eeea1 doc: results interpretation 2023-08-18 17:57:49 +02:00
Shankara Pailoor
1fca1b5e98 BUG FIX: basis2 lemma only valid for bit sizes < 254. 2023-08-18 10:21:52 -05:00
Yanju Chen
2f46e3e174 Update docker-image.yml 2023-08-16 17:36:02 -07:00
Yanju Chen
367cdc8a89 Update docker-image.yml 2023-08-16 17:30:05 -07:00
Yanju Chen
5d96a64d4a Create r1cs-z3-ab0-optimizer.rkt 2023-08-16 17:21:24 -07:00
Yanju Chen
a08dceb4cb Delete r1cs-z3-AB0-optimizer.rkt 2023-08-16 17:21:13 -07:00
Yanju Chen
44ea29ab89 update test workflow 2023-08-16 17:05:51 -07:00
Yanju Chen
088e4b8aa0 sync with latest research artifact 2023-08-16 16:56:02 -07:00
chyanju
871c3694a9 added circomlibex benchmarks 2022-10-13 22:05:39 -07:00
chyanju
60848c9720 separate unknown/unsafe results 2022-10-10 17:29:29 -07:00
chyanju
2bd5538885 update benchmarks and entrypoint 2022-10-09 18:34:30 -07:00
chyanju
ff6d1535a6 update entrypoint and cexp support 2022-10-09 17:36:24 -07:00
chyanju
3bd71a585f finalize github actions 2022-10-09 14:19:52 -07:00
chyanju
c382bfe4d8 added more github actions and small fix 2022-10-08 13:21:29 -07:00