* device default * Device.DEFAULT * half max for cuda * CUDA_INCLUDE_PATH * closer to working * cuda fixups * Update ops_cuda.py