39 Commits

Author SHA1 Message Date
Sorawee Porncharoenwase
d96a1e0e52 core: change the variable representation 2024-03-14 05:04:03 +07:00
Sorawee Porncharoenwase
23dcab4798 fix: fix performance in basis2 lemma 2024-03-14 05:04:03 +07:00
Sorawee Porncharoenwase
b4b2e265eb fix: fix performance issue in the inference 2024-03-14 05:04:03 +07:00
Sorawee Porncharoenwase
cfb4a1b42a refactor: hoist range inference so that it's run only once 2024-03-14 05:04:03 +07:00
sorawee
4d7e42cf50 feat: log inferred known variable count (#43) 2024-03-14 05:04:03 +07:00
Sorawee Porncharoenwase
a35578f069 refactor: make reader more modular in preparation for gnark support 2024-03-14 05:04:03 +07:00
sorawee
26a02230d3 chore: refactored the inclusion of definitions related to the prime. (#22)
Previously, we added them during a constraint optimization pass as
a part of the AST traversal. This is not ideal for many reasons.
First, we only want to include the definitons once,
but the AST traversal opens an opportunity to accidentally include them
more than once (e.g. if we happen to have a rcmds as a subnode of another node).
Second, it meant that we needed the pdef? parameter so that
on alternative constraints, the prime definitions are omitted.
The refactoring simply put the definition inclusion at the top-level,
simplifying the optimization pass.
2024-03-14 05:04:03 +07:00
sorawee
65d817a134 chore: refactor solver interfaces to use Racket class/object system (#21) 2024-03-14 05:04:03 +07:00
sorawee
3c246d728e chore: refactor selector to use Racket class/object system (#20) 2024-03-14 05:04:03 +07:00
Sorawee Porncharoenwase
ec6b6842cc chore: refactor constraints 2024-03-14 05:04:03 +07:00
sorawee
d588d87f39 cnst gen: always generate full xvec and yvec (#17)
Prior this commit, yvec will reuse input variables from xvec
which is sound because we assume that they are equal anyway.
However, this reusing adds complexity to Picus.
This commit removes the reusing, simplifying the codebase.
The removal should also not affect performance,
since SMT solvers are very good at dealing with equalities.
This also allows modular verification to be programmed more
straightforwardly.
2024-03-14 05:04:03 +07:00
Sorawee Porncharoenwase
3b012049d8 logging: omit excessive logging 2024-03-14 05:04:03 +07:00
Sorawee Porncharoenwase
f72939fced fix: fix the format of accounting logging 2024-03-14 05:04:03 +07:00
sorawee
695acd8cb6 logging: more debugging info for linear lemma (#62) 2023-10-19 09:21:30 -05:00
sorawee
3696b54d4e feat: add accounting logging (#61) 2023-10-17 06:39:24 -05:00
Sorawee Porncharoenwase
4ee45bc00a feat: support wtns generation
The commit adds --wtns, which indicates the output directory for the
witness files. If not given, no witness file is generated.
The witness files are called `first-witness.wtns` and
`second-witness.wtns`.
2023-10-06 22:56:02 +07:00
sorawee
3a801645ba feat: switch Picus to use the SaaS framework (#47) 2023-10-03 10:44:26 -05:00
Sorawee Porncharoenwase
eb2e8cda80 chore: adjust behavior of no verbose
Make --verbose 0 not output algorithm computation.
Also remove the leading "#" in the output
2023-09-26 06:53:47 +07:00
sorawee
8a71edc19d chore: put files under picus subdirectory (#35)
Reduce files cluttering the project top-level
2023-09-21 13:09:05 -05:00
sorawee
2b28e47055 fix: handle interrupt correctly (#30)
This commit fixes the problem where an interrupt will not kill a
solver spawn by Picus, causing many solvers to be run concurrently in
the background. Since the fix needs to be applied to three files
that communicate with cvc5, Z3, and cvc4, I also took an opportunity to
refactor them and unify them into one common file.
2023-09-01 14:01:15 -05:00
Sorawee Porncharoenwase
37605c0d76 fix: make printer for query generation more efficient
We should avoid building the resulting query string with `format`
repeatedly, because `format` creates an entirely new string in
each invocation. When the query is big, the cost is
prohibitively expensive.

This commit instead builds the query string by writing to a
string port directly (similar to the StringBuilder idiom in languages
like Java). As the fix needs to be done on Z3, cvc4, and cvc5, I also
take an opportunity to refactor them to reduce code redundancy.
2023-08-30 14:37:09 -07:00
Sorawee Porncharoenwase
d9ae5ed3d9 fix: make query generation more efficient
Use vector indexing instead of list indexing.
This avoids cost to list traversal, which is significant in a hot loop.
2023-08-30 14:29:59 -07:00
sorawee
4d6e48bcc7 fix: make binary01-lemma more efficient (#22)
Switch list-ref to direct element iteration, reducing quadratic running time to linear.
2023-08-30 14:40:35 -05:00
sorawee
8515124cdf fix: make aboz lemma more efficient (#24)
Instead of using `rcmds-ref` in a loop, which takes quadratic time,
directly iterate over three consecutive elements simultaneously,
making it linear time traversal.
This change makes the large benchmark not stuck at aboz lemma.
2023-08-30 14:38:15 -05:00
sorawee
9505e67158 fix: make bim lemma more efficient (#25)
This is done by switching list-ref in a loop to direct iteration.
Also fix a typo that labels the pass from "aboz" to "bim".
2023-08-30 14:37:43 -05:00
Sorawee Porncharoenwase
e5c3b09e73 fix: improve performance of basis2-lemma
Previously, we use `(list->set (set->list ks))` to copy the set `ks`,
but this is in fact not needed at all. `ks`, which is an immutable set,
can't be structurally mutated (although its binding can be mutated).

The copying is especially harmful when it's done in a hot loop
(the `update` function in particular), since known and unknown set could
be very large.

This change makes the large benchmark not stuck at basis2-lemma
2023-08-30 10:13:00 -07:00
sorawee
186d62db39 fix: reimplement linear lemma to make it more efficient (#18)
Prior this commit, linear lemma depends on the CDMAP and RCDMAP
procedures.

Let's say that constraint i involves variables Xi and Yi, where Xi variables
are the linear part, and Yi variables are the non-linear parts.
That is to say, the input follows this format:

  Constraint 1: X11 X12 ... Y11 Y12 ...
  Constraint 2: X21 X22 ... Y21 Y22 ...
  ...
  Constraint n: Xn1 Xn2 ... Yn1 Yn2 ...

which is of size O(sum |Xi| + sum |Yi|).

The CDMAP procedure produces a map from a variable to a list of list of
variables. Meaning: a key can be uniquely determined if one of the list
of variables are all uniquely determined.

Thus, CDMAP produces a data of size O(sum { |Xi| (|Xi| + |Yi|) }) as the
output (where values of the same key are grouped together)

  X11: X12 X13 ... Y11 Y12 ...
  X12: X11 X13 ... Y11 Y12 ...
  ...
  X1k: X11 X12 ... Y11 Y12 ...

  X21: X22 X23 ... Y21 Y22 ...
  X22: X21 X23 ... Y21 Y22 ...
  ...

  .....

  Xn1: Xn2 Xn3 ... Yn1 Yn2 ...
  Xn2: Xn1 Xn3 ... Yn1 Yn2 ...
  ...
  Xnk: Xn1 Xn2 ... Yn1 Yn2 ...

Often, some clauses might have O(n) variables, making the size
O(n^3) if the table is dense, or O(n^2) if the table is sparse.

The RCDMAP procedure essentially consumes a map that CDMAP procedure
outputs, and produces the inverse of that map, with duplicate keys of
the input map grouped together. The size is thus the same.

RCDMAP is also used in the counter selector. E.g., given:

  1 2 3 | 4 6
  1 3 5 | 4 6

Then CDMAP produces:

  1: [2 3 4 6, 3 5 4 6]
  2: [1 3 4 6]
  3: [1 2 4 6, 1 5 4 6]
  5: [1 3 4 6]

And RCDMAP produces:

  1 3 4 6: [2, 5]
  2 3 4 6: [1]
  3 5 4 6: [1]
  1 2 4 6: [3]
  1 5 4 6: [3]

The counter selector counts that 5 occurs 2 times in the keys,
so it is weighted 2. 6, by contrast, occurs 5 times, so it is weighted 5.

-------------------------------------------------------------------

This commit aims to improve the performance by avoiding blowing up the
data size beyond linear. It also aims to preserve the existing behavior
as much as possible.

This is done by removing the CDMAP and RCDMAP procedures,
and instead employing the straightforward encoding:

  X11 X12 ... | Y11 Y12 ...
  X21 X22 ... | Y21 Y22 ...
  ...
  Xn1 Xn2 ... | Yn1 Yn2 ...

which has the linear size.

When a variable is uniquely determined, we simply cross it off from the
above table. A variable becomes propagable when it's the only one left
in the linear part, and the non-linear part is empty.

For counter selector, we produce the counter without expanding the
encoding beyond linear size. This can be done by simple multiplication:
in (Xi, Yi), each variable in Xi contributes to other variables in Xi
exactly |Xi|-1 times (not counting itself) and each variable in Yi
contributes to other variables in Xi exactly |Xi| times.

The results are not the same, however. Consider the earlier concrete example:

Our encoding:

  1 2 3 | 4 6
  1 3 5 | 4 6

RCDMAP:
  1 3 4 6: [2, 5]
  2 3 4 6: [1]
  3 5 4 6: [1]
  1 2 4 6: [3]
  1 5 4 6: [3]

The weights of 5 in both our encoding and RCDMAP agree: 2.
However, the weights of 6 do not agree.

  Our encoding produces weight 6.
  RCDMAP produces weight 5.

Arguably, our weight is more "correct", because it takes the value
part into account. I.e., in "1 3 4 6: [2, 5]", 6 contributes twice
(2 once and 5 once), so it should be counted twice.

Separately, I also believe that the counter selector is not ideal,
and plan to revamp the strategy soon, so the discrepancies will not
matter anyway.
2023-08-30 12:03:28 -05:00
sorawee
ea0cb6a4e4 fix: compute unknown set correctly (#20)
In basis2 lemma, set-remove (which removes a single element) is used
instead of set-subtract (which removes a set of elements).
Previously, Picus gets away with this incorrect result because linear
lemma will "clean up" afterward by syncing known set and unknown set.
However, in the PR #18, we no longer do this syncing,
thus uncovering this issue. This commit fixes the problem.
2023-08-30 11:16:01 -05:00
Sorawee Porncharoenwase
d570e92a60 chore: rename xlist to varlist, general cleanup 2023-08-29 13:01:37 -07:00
sorawee
dac94283f7 fix: make known set computation more efficient (#15)
Instead of using list-ref in a loop (which takes quadratic time)
and list membership in a loop (which also takes quadratic time),
switch to direct element iteration and set membership query.
This reduces the running time from >10mins (I haven't fully measured it.
Terminated it early because I don't want to wait) to 0.6s.
2023-08-29 12:30:13 -05:00
sorawee
ff64bd9220 feat: support --verbose and --cex-verbose (#14)
This commit renames the existing --verbose to --cex-verbose,
and makes --verbose control the non-counterexample output,
which could be overwhelming on a very large circuit.
Similar to --cex-verbose, there are three levels for --verbose.
- 0: hides "large" outputs entirely.
- 1: shows the output, but display ... when the output is too large
- 2: shows the full output.
2023-08-28 21:27:28 -05:00
sorawee
f3a87e1523 fix: read prime from circuit (#10)
We previously hard coded the prime number and the log of the prime number.
This commit instead reads the prime number from the circuit.
2023-08-25 17:22:01 -05:00
sorawee
5d902c6f47 Restructure counterexample output format for readability (#5)
This commit adds a flag `--verbose` for verbose level,
which supplants the flag `--raw-output` (or, previously, `--map`).
- When the verbose level is 0 (not verbose),
  the output is in the circom variable format.
- When the verbose level is 1, the output is a mixed between the circom
  variable format and the r1cs signal format, where we prefer the circom
  variable format whenever possible.
- When the verbose level is 2, the output is always in
  the r1cs signal format.

For `--verbose 0`, the output now has three sections: inputs, first
possible outputs, and second possible outputs.

For other verbosity level, there could be four sections. The extra
section is "other bindings".

Entries that are different in the first possible outputs and
second possible outputs are further highlighted with ANSI escape sequence.

Examples:

With `--verbose 0`:

```
  # inputs:
    # m1.main.inp: 0
  # first possible outputs:
    # m1.main.out[0]: 0
    # m1.main.out[1]: 0
    # m1.main.success: 0
  # second possible outputs:
    # m2.main.out[0]: 1
    # m2.main.out[1]: 0
    # m2.main.success: 1
```

With `--verbose 1`:

```
  # inputs:
    # m1.main.inp: 0
  # first possible outputs:
    # m1.main.out[0]: 0
    # m1.main.out[1]: 0
    # m1.main.success: 0
  # second possible outputs:
    # m2.main.out[0]: 1
    # m2.main.out[1]: 0
    # m2.main.success: 1
  # other bindings:
    # one: 1
    # p: 21888242871839275222246405745257275088548364400416034343698204186575808495617
    # ps1: 21888242871839275222246405745257275088548364400416034343698204186575808495616
    # ps2: 21888242871839275222246405745257275088548364400416034343698204186575808495615
    # ps3: 21888242871839275222246405745257275088548364400416034343698204186575808495614
    # ps4: 21888242871839275222246405745257275088548364400416034343698204186575808495613
    # ps5: 21888242871839275222246405745257275088548364400416034343698204186575808495612
    # zero: 0
```

With `--verbose 2`

```
  # inputs:
    # x4: 0
  # first possible outputs:
    # x1: 0
    # x2: 0
    # x3: 0
  # second possible outputs:
    # y1: 1
    # y2: 0
    # y3: 1
  # other bindings:
    # one: 1
    # p: 21888242871839275222246405745257275088548364400416034343698204186575808495617
    # ps1: 21888242871839275222246405745257275088548364400416034343698204186575808495616
    # ps2: 21888242871839275222246405745257275088548364400416034343698204186575808495615
    # ps3: 21888242871839275222246405745257275088548364400416034343698204186575808495614
    # ps4: 21888242871839275222246405745257275088548364400416034343698204186575808495613
    # ps5: 21888242871839275222246405745257275088548364400416034343698204186575808495612
    # zero: 0
```
2023-08-22 15:08:36 -05:00
Shankara Pailoor
1fca1b5e98 BUG FIX: basis2 lemma only valid for bit sizes < 254. 2023-08-18 10:21:52 -05:00
Yanju Chen
088e4b8aa0 sync with latest research artifact 2023-08-16 16:56:02 -07:00
chyanju
60848c9720 separate unknown/unsafe results 2022-10-10 17:29:29 -07:00
chyanju
ff6d1535a6 update entrypoint and cexp support 2022-10-09 17:36:24 -07:00
chyanju
6c94666d9b clean up 2022-09-11 10:18:20 -07:00
chyanju
a17d21a5fc major refactoring
- sort out algorithm structure, deep refactoring still required
- fixed z3 errors
- other small fixes
2022-09-09 18:58:20 -07:00