* add cumsum with n-dim inputs, over arbitrary axis + relevant tests
* increased rtol for cumsum test
* move test_cumsum into test_ops
* skip arange test for images as relies on cumsum
* Fix typo
* rewrite cumsum to work with images
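For reference, the semantics of a cumulative sum over an arbitrary axis of an n-dim input can be sketched in pure Python on nested lists. This is not the implementation added in these commits; `cumsum` and `_add` here are hypothetical helpers, assuming non-negative `axis` values and rectangular nested lists, and could serve as a slow oracle when checking a real implementation.

```python
from itertools import accumulate

def _add(a, b):
    # Elementwise addition of two equally shaped nested lists (or scalars).
    if isinstance(a, list):
        return [_add(p, q) for p, q in zip(a, b)]
    return a + b

def cumsum(x, axis=0):
    # Hypothetical reference: cumulative sum of a nested list along `axis`.
    # Assumes axis >= 0 and a rectangular (non-ragged) input.
    if axis > 0:
        # Recurse into each sub-array until the target axis is the outermost one.
        return [cumsum(sub, axis - 1) for sub in x]
    if not x or not isinstance(x[0], list):
        # 1-D case: plain running sum.
        return list(accumulate(x))
    # n-D case along the outermost axis: running elementwise sum of sub-arrays.
    return list(accumulate(x, _add))
```

For a 2x3 input, `axis=0` accumulates down the rows and `axis=1` accumulates within each row.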
* safetensors test
* safe_save
* load back with real safetensors
* bugfix in device name. add simple torch_load
* it works for llama, but it's slower...
* mmap
* no intermediate
* load mmaped
* readinto speed
* not ready yet
* revert that
* ConstantOfShape ONNX test fixed.
* removed redundant if statement
* value is optional and should default to a float32 tensor with value of 0
* fixed: default parameter values are created once at function definition, which is unsafe for mutable objects.
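The pitfall behind the fix above can be illustrated with a minimal sketch. `append_bad` and `append_good` are hypothetical names chosen for illustration, not code from this change: a mutable default is built once when the `def` statement runs and is then shared by every call that omits the argument.

```python
def append_bad(item, bucket=[]):
    # The default list is created once, at function definition time,
    # so every call without `bucket` mutates the same shared list.
    bucket.append(item)
    return bucket

def append_good(item, bucket=None):
    # Use None as a sentinel and build a fresh list on each call instead.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket
```

Calling `append_bad` twice without a second argument returns the same list object both times, while `append_good` returns a new list per call.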
* make train_efficientnet work out of the box for new users.
  The default configuration of train_efficientnet uses the smaller cifar
  dataset, but import datasets.imagenet tries to open imagenet_class_index.json
  and fails unless the user has already downloaded it.
* Added a few missing return type hints for tensor.py
* added test for empty tensor for Tensor.numel()
* fixed missing numel call in test_numel
* small change in reshape shape condition check
* Merge from upstream
* add and reorganize test_slice_* tests
* refactor Tensor.__getitem__()
* preliminary tests for 1) 0D tensors and 2) varargs for Tensor.zeros and Tensor.ones
* always compare shapes of the numpy arrays obtained from tinygrad and torch tensors
* add more tests for 0D support
* remove test_tensor.test_slicing(). All slicing tests at test/test_ops.py
* add zero-dim support
* make test_end2end.py consistent with 0dim support
* add test for tensor with zero in shape
* don't simplify ones if shape is ()
* skip tests that need zero-size tensor support.
- zero-size tensor support is not related to 0-dim tensors.
* add tests for __getitem__() supporting strides >= 1
* refactor __getitem__: support for strides >= 1
* minor refactors and add comments to __getitem__
* add tests for slices with negative steps
* add support for slices with negative strides
---------
Co-authored-by: deefi <dee7ine@gmail.com>
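One common way to support slices with negative steps, as in the `__getitem__` entries above, is to flip the sequence and then take a positive-stride slice. The sketch below shows that index mapping on plain Python sequences; `slice_with_flip` is a hypothetical helper, and whether the refactored `Tensor.__getitem__` uses this exact decomposition is an assumption, not something the log states.

```python
def slice_with_flip(seq, s):
    # Normalize start/stop/step against the sequence length (stdlib slice.indices).
    start, stop, step = s.indices(len(seq))
    if step > 0:
        return seq[start:stop:step]
    # Negative step: reverse the sequence, then translate the bounds into
    # the flipped coordinate system and slice with the positive stride -step.
    n = len(seq)
    flipped = seq[::-1]
    return flipped[n - 1 - start : n - 1 - stop : -step]
```

After normalization with a negative step, `start` and `stop` both lie in `[-1, n - 1]`, so the translated bounds are never negative and cannot wrap around.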
* Fix ONNX dropout and unify the implementation
* Use tensor rand method for dropout
* Change approach for RNG in ONNX Dropout
* Fix style
* Test legacy RNG seeding
* Remove the necessity for legacy RNG in Tensor class
* added metal int64 and some simple tests
* removed bool return type def
* typo in test
* also missing in clang and gpu runtimes
* switched order for opencl
* increased atol and removed new line in kernel prefix
* added kaiming_uniform init for conv2d and linear layers
* fix: set getattr
* up
* fix: set getattr
* fix comments
* better does not mean it is good
* more nonlinearities
* added test
  checks the distribution of the default relu option
* prettier
* fix kernel size
* edit distribution of returned tensor
* complete tests and fix fan_mode
* added higher dim test
* prettier test
* fix silly blank
* just leaky_relu mode
* default fan in and leaky relu
* update params
* fix test
* shorter
* generalize Tensor.uniform and adjust kaiming init
- added low and high parameters to Tensor.uniform so it can sample from a specific range (default is 0 to 1)
- adjusted return line of kaiming_uniform
* range from -1 to 1
* delete comment
* adjusted test_uniform
* fixed
* delete comment
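The relationship between a ranged uniform sampler and Kaiming uniform initialization, as touched on in the entries above, can be sketched in pure Python. This is not the tinygrad code: `uniform`, `kaiming_uniform_bound`, and `kaiming_uniform` are hypothetical stand-ins operating on flat lists, and the default negative slope `a=0.01` is an assumption. The formula used is the standard one for fan-in mode with a leaky ReLU gain: bound = sqrt(2 / (1 + a^2)) * sqrt(3 / fan_in).

```python
import math
import random

def uniform(n, low=0.0, high=1.0, rng=random):
    # Draw n samples from [low, high); the default range is 0 to 1.
    return [low + (high - low) * rng.random() for _ in range(n)]

def kaiming_uniform_bound(fan_in, a=0.01):
    # Leaky ReLU gain with negative slope `a` (assumed default), fan-in mode.
    gain = math.sqrt(2.0 / (1.0 + a * a))
    return gain * math.sqrt(3.0 / fan_in)

def kaiming_uniform(n, fan_in, a=0.01, rng=random):
    # Sample n weights uniformly from [-bound, bound).
    bound = kaiming_uniform_bound(fan_in, a)
    return uniform(n, -bound, bound, rng)
```

For a linear layer, `fan_in` would be the number of input features; for a conv2d layer it would be the input channels times the kernel area.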