* preliminary test
* missed Optional
* don't check for cache during recursion
* match style from st_fixup... may be marginally faster?
* pathological test case: strongly connected DAG
* move to test_schedule as this isn't really a fusion
* oops this shouldn't be edited
* Revert "oops this shouldn't be edited"
This reverts commit 487cb027dc.
* Revert "move to test_schedule as this isn't really a fusion"
This reverts commit 48d8c550ce.
* move to test_schedule as this isn't really a fusion
* ok no more merge error funny business
* generate new kernel dataset
prerequisite for removing NumNode
```
# regenerate the kernel dataset, then compress it and store it under extra/datasets/
extra/optimization/generate_dataset.sh
gzip -k /tmp/sops
mv /tmp/sops.gz extra/datasets/
```
* fix var range in fuzz_linearizer
* remove pyint
* bump time on tp [pr]
* dont truncate in const fold
* remove dead code
* Revert "dont truncate in const fold"
This reverts commit 29c81db0f7.
* remove define_var
* add types in batchnorm class
* fix lint error in batchnorm types
* add types to conv1d function
* add types to convtranspose1d func and conv2d, convtranspose2d classes
* add types to all remaining classes
* change conv1d padding type to also accept str
* less is more; only keep non-obvious types
* mkdocs needs types
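For reference, a simplified sketch of the kind of annotation these commits describe (signature is illustrative, not copied from the repo), with padding widened to also accept a string:
```
from typing import Union

# illustrative only: a conv1d-style helper with explicit annotations, where padding
# may be an int or a string such as 'same'
def Conv1d(in_channels: int, out_channels: int, kernel_size: int, stride: int = 1,
           padding: Union[int, str] = 0, dilation: int = 1, groups: int = 1, bias: bool = True):
  ...
```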
* add support for single-element tensors for slices
* rm trailing spaces
* cleanup long lines
* remove tensor in slice support, add comprehensive err msg
* cleanup getitem, add slice type check
* Edit err message
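A minimal sketch of the kind of index type check and error message these commits describe (hypothetical helper, not the actual getitem code):
```
# hypothetical helper, not tinygrad's Tensor.__getitem__: reject unsupported index
# types up front with a descriptive error instead of failing deep inside getitem
def check_index_types(indices: tuple) -> None:
  allowed = (int, slice, type(None), type(Ellipsis), list, tuple)
  for i in indices:
    if not isinstance(i, allowed):
      raise IndexError(f"invalid index type {type(i).__name__}; expected one of "
                       + ", ".join(t.__name__ for t in allowed))
```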
* ops_cuda: add optional dynamic smem parameter
This is required to enable larger-than-48 KB shared memory usage on a per-kernel basis.
* move setting max dynamic smem size to init
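Roughly, going above the 48 KB default requires raising the per-kernel limit with the CUDA driver API before launch. A minimal ctypes sketch (illustrative only, not the actual ops_cuda code):
```
import ctypes

# illustrative only: by default a CUDA kernel may use at most 48 KB of dynamic shared
# memory; to exceed that, the limit must be raised per kernel before launch
CU_FUNC_ATTRIBUTE_MAX_DYNAMIC_SHARED_SIZE_BYTES = 8  # value from cuda.h

def allow_large_dynamic_smem(libcuda: ctypes.CDLL, cu_function, smem_bytes: int) -> None:
  # cu_function is a CUfunction handle obtained from cuModuleGetFunction
  status = libcuda.cuFuncSetAttribute(
    cu_function, CU_FUNC_ATTRIBUTE_MAX_DYNAMIC_SHARED_SIZE_BYTES, ctypes.c_int(smem_bytes))
  if status != 0: raise RuntimeError(f"cuFuncSetAttribute failed: {status}")

# usage sketch: allow_large_dynamic_smem(ctypes.CDLL("libcuda.so"), fn, 96 * 1024)
```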
* add support for padding='same' in nn.conv
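Roughly, 'same' padding resolves to explicit per-side pads so the output spatial size matches the input at stride 1, including dilation. A minimal sketch (hypothetical helper, not the actual nn.conv code):
```
# hypothetical helper, not tinygrad's nn code: compute per-side pads for padding='same'
# at stride 1; dilation widens the effective kernel extent
def same_padding(kernel_size, dilation):
  pads = []
  for k, d in zip(kernel_size, dilation):
    total = d * (k - 1)                            # total padding needed along this axis
    pads.append((total // 2, total - total // 2))  # split, extra pad on the far side
  return pads

print(same_padding((3, 3), (1, 1)))  # [(1, 1), (1, 1)]
print(same_padding((5,), (2,)))      # [(4, 4)]
```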
* express concisely
* simplify loop
* test same padding with dilation and conv1d
* fix bad indentation
* make loop one liner
addressed #6935
The first few terms in fold_unrolled_divs might have been folded already, so the check should first try to add those terms back. There is a case where all but one term is folded, which is no longer an add chain; that one is just added as a failing test case for now.
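To illustrate the identity involved and why leading terms can already be gone (plain Python check, not the actual uop rewrite rules):
```
# the sum of (x+i)//n over i in 0..n-1 always equals x for integers, which is what an
# unrolled-div chain folds down to
def unrolled_divs(x: int, n: int) -> int:
  return sum((x + i) // n for i in range(n))

assert all(unrolled_divs(x, 4) == x for x in range(-8, 32))

# if x is known to lie in [0, 4), the (x+0)//4 term is already folded to 0 elsewhere, so
# the remaining chain is missing a term; the check has to add the folded term back before
# it can conclude the sum equals x
assert all(0 + (x + 1) // 4 + (x + 2) // 4 + (x + 3) // 4 == x for x in range(4))
```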