* Make `dev` a property of `Allocator`
(this is a prereq refactor for #10285)
At least `BufferXfer.copy` accesses it assuming it's always present.
Currently most devices add this property on their own, repeating
the same code over and over again.
This is also a bit of a footgun: `RemoteAllocator` named this
property `device` instead of `dev`. I could obviously just change that
in one place, but doing it globally seems like a better solution (and it
reduces code duplication too).
`MallocAllocator` is a bit special, but passing `None` works just fine.
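A minimal sketch of the idea, using simplified hypothetical classes
(`FakeDevice`, `FakeAllocator`, `HostAllocator` are illustration only,
not the actual definitions): the base class stores `dev` once, so
subclasses cannot forget it or give it a different name.
```python
from typing import Generic, TypeVar

DeviceType = TypeVar("DeviceType")

class Allocator(Generic[DeviceType]):
  # Hypothetical simplified base class: `dev` is assigned in exactly one place.
  def __init__(self, dev: DeviceType):
    self.dev: DeviceType = dev

class FakeDevice:
  # Stand-in for a real device object, for illustration only.
  def __init__(self, name: str): self.name = name

class FakeAllocator(Allocator[FakeDevice]):
  # No per-subclass `self.dev = dev` boilerplate is needed.
  pass

class HostAllocator(Allocator[None]):
  # Plays the role of an allocator with no device behind it: `dev` is None.
  def __init__(self): super().__init__(None)

def devices_of(dest: Allocator, src: Allocator) -> tuple:
  # Generic code can rely on `.dev` existing on every allocator.
  return (dest.dev, src.dev)
```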
* typing
* ignore type instead of cast
Should be a small speed improvement, but the main reason this is needed
is to have a defined ordering of RemoteRequests within one host so that
transfers won't require doing something like:
```python
src_dev.batch_submit()
dest_dev.q(Transfer(dest, src_dev.session, src))
dest_dev.batch_submit()
```
for correctness.
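One way such an ordering could be obtained, sketched under the
assumption that all devices on a host share a single request queue
(`HostConnection` and `Dev` are hypothetical names, not the real classes):
```python
from dataclasses import dataclass, field

@dataclass
class HostConnection:
  # Hypothetical: one ordered queue shared by every device on the same host.
  queue: list = field(default_factory=list)
  sent: list = field(default_factory=list)
  def q(self, req): self.queue.append(req)
  def batch_submit(self):
    # Everything queued so far goes out in queue order, in a single batch.
    self.sent.extend(self.queue)
    self.queue.clear()

@dataclass
class Dev:
  idx: int
  conn: HostConnection
  # Requests are tagged with the device index and appended to the shared queue.
  def q(self, req): self.conn.q((self.idx, req))

conn = HostConnection()
src_dev, dest_dev = Dev(0, conn), Dev(1, conn)
src_dev.q("write src")
dest_dev.q("transfer src -> dest")
conn.batch_submit()  # one submit; the write is ordered before the transfer
```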
Not strictly required for anything, but soon there will be about 4 new
properties, and having it be a huge JSON just seems like bad taste.
It also seems right not to have a separate endpoint for this, just a
`GetProperties` request that returns a repr of this, similar to how
requests are sent in `BatchRequest`.
This will also make a switch to anything other than HTTP much simpler
should it be required for any reason, e.g. just a TCP stream of
`BatchRequest`s.
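A rough sketch of the repr round trip. The `RemoteProperties` fields and
the `parse_properties` helper are assumptions made for illustration, and
the parser only accepts a single whitelisted constructor call with
literal keyword arguments:
```python
import ast
from dataclasses import dataclass

@dataclass(frozen=True)
class RemoteProperties:
  # Hypothetical fields, for illustration only.
  real_device: str
  graph_supported: bool

def parse_properties(text: str) -> RemoteProperties:
  # Accept only `RemoteProperties(name=<literal>, ...)`; reject anything else.
  node = ast.parse(text, mode="eval").body
  ok = isinstance(node, ast.Call) and isinstance(node.func, ast.Name) \
    and node.func.id == "RemoteProperties" and not node.args
  if not ok: raise ValueError("unexpected response")
  return RemoteProperties(**{kw.arg: ast.literal_eval(kw.value) for kw in node.keywords})

props = RemoteProperties(real_device="GPU", graph_supported=True)
wire = repr(props)              # what the server would send back
parsed = parse_properties(wire) # what the client reconstructs
```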
* Basic remote multi support
Simplest thing to be able to use remote with multiple GPUs. Very slow
because there are no transfers (cross-device copies go through copyin
and copyout).
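To illustrate why this is slow, a sketch of a cross-device copy staged
through host memory. `FakeRemoteAllocator` is a hypothetical stand-in
that models device memory as a `bytearray`:
```python
class FakeRemoteAllocator:
  # Hypothetical stand-in: "device memory" is just a bytearray here.
  def alloc(self, size: int) -> bytearray: return bytearray(size)
  def _copyin(self, dest: bytearray, src: memoryview): dest[:] = src
  def _copyout(self, dest: memoryview, src: bytearray): dest[:] = src

def cross_device_copy(dest_alloc, dest_buf, src_alloc, src_buf):
  # No direct transfer: the data makes a round trip through the host.
  staging = memoryview(bytearray(len(src_buf)))
  src_alloc._copyout(staging, src_buf)   # device 0 -> host
  dest_alloc._copyin(dest_buf, staging)  # host -> device 1

a0, a1 = FakeRemoteAllocator(), FakeRemoteAllocator()
src = a0.alloc(4)
src[:] = b"abcd"
dst = a1.alloc(4)
cross_device_copy(a1, dst, a0, src)
```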
* tests
This is a prereq refactor for cloud multi, which will make it possible to
use multiple devices from a cloud host instead of just one.
I will do that by changing the session to be a `tuple[token, dev_idx]`.
Previously the session was in cookies. This is a problem because a single
HTTP request can contain many RemoteRequests with potentially different
devices.
The alternatives are either:
- sending commands for different devices in separate HTTP requests (slow)
- only adding an idx in RemoteRequest in basically the same way I added
  session here, keeping session a cookie and concatenating them in the
  server. This is how I've done it previously and it looks strictly worse
  than having it all be in the same place.
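A small sketch of what carrying the session inside each request might
look like. The field layout and the `route` helper are assumptions for
illustration, not the actual definitions:
```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class RemoteRequest:
  # Hypothetical shape: each request carries its own (token, dev_idx) session.
  session: Optional[tuple[str, int]] = None

@dataclass(frozen=True)
class BufferAlloc(RemoteRequest):
  buffer_num: int = 0
  size: int = 0

def route(batch: list) -> dict:
  # One HTTP body can mix devices; dispatch on each request's own session.
  per_session: dict = {}
  for req in batch:
    per_session.setdefault(req.session, []).append(req)
  return per_session

token = "abc123"
batch = [
  BufferAlloc(session=(token, 0), buffer_num=1, size=16),
  BufferAlloc(session=(token, 1), buffer_num=1, size=16),
  BufferAlloc(session=(token, 0), buffer_num=2, size=32),
]
routed = route(batch)
```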
* work on minrf example
* more
* jit sample
* t is tensor not const
* fixes
* more convs
* fix dropout
* don't print
* 504
* big patch
* onehot
* touch
* use embeddings
* dumb uses final layer
* act
* non fl
* match
* tp
* 3
* of
* ppsz
* normal
* add adln
* no t
* weird transformer
* weird transformer
* contig
* actual speed fix
* dumb
* cb
* 0
* t is 0
* mort-t
* args
* dumb days are over
* readable
* contig
* no more t mask
* mask_t
* init to zero
* clean
* steps
* work
* tt
* t
* solid
* even spacing in viz nodes
* precise dy value
* dominant-baseline text-after-edge
* add STROKE_WIDTH constant, delete dominant_baseline attr
---------
Co-authored-by: qazal <77887910+Qazalin@users.noreply.github.com>