* change runtime to be synchronous
* fix test runtime with the new interface
* fix arg
* fix eval
* fix missing config attribute
* fix plugins
* fix on_event by reverting it back to async
* update upload_file endpoint
* fix argument to upload file
* remove unnecessary async for eval; fix evaluation run in parallel
* use asyncio to run controller for eval
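A minimal sketch of the pattern behind "use asyncio to run controller for eval": the eval harness is synchronous, but the controller still exposes an async entry point, so each worker drives it with `asyncio.run()`. Function names here (`run_controller`, `evaluate_instance`) are illustrative assumptions, not the project's actual API.

```python
import asyncio

async def run_controller(instance_id: str) -> str:
    # Stand-in for the real async controller loop (name is an assumption).
    await asyncio.sleep(0)
    return f"done:{instance_id}"

def evaluate_instance(instance_id: str) -> str:
    # Each (possibly parallel) eval worker owns its own event loop,
    # so the synchronous harness can call the async controller safely.
    return asyncio.run(run_controller(instance_id))

print(evaluate_instance("42"))
```

Giving each worker its own event loop via `asyncio.run()` avoids sharing one loop across parallel evaluation runs.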
* revert file upload
* truncate eval test result output
* aider-bench: add visualization to the summarize script and README
* added example cost and actions histogram images for the README
* moved dependencies to evaluation section
* added optional START_ID env flag to resume from that instance ID
* prepare_dataset: fix comparisons by using instance IDs as ints
* aider bench complete_runtime: close the runtime so the container is shut down
* added matrix display of instance IDs for logging
* fix typo in summarize_results.py saying summarise_results
* changed start_id to skip_num to skip rows from the dataset (start_id wasn't supportable)
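The start_id-to-skip_num change can be sketched as follows: instead of filtering on a starting instance ID (which assumed IDs were sortable and contiguous), the dataset preparation simply drops the first `skip_num` rows. The function name and shape are assumptions for illustration.

```python
def prepare_dataset(rows: list, skip_num: int = 0) -> list:
    """Skip the first `skip_num` rows of the dataset.

    Replaces the earlier start_id approach, which broke when
    instance IDs were not sequential or comparable.
    """
    return rows[skip_num:]

print(prepare_dataset([101, 102, 205, 310], skip_num=2))  # → [205, 310]
```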
* doc changes about Hugging Face Spaces to temporarily point back to OD
* add eval_ids arg to run specific instance IDs; fix/extend README
* fix description in parser for --eval-ids
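A hedged sketch of how the `--eval-ids` flag might be wired up with argparse: a comma-separated string of instance IDs is parsed into a list, with `None` meaning "run the whole dataset". The exact help text and parsing details in the project may differ.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument(
    "--eval-ids",
    type=str,
    default=None,
    help="Comma-separated list of instance IDs to evaluate "
         "(e.g. '1,3,10'); runs the whole dataset when omitted.",
)

args = parser.parse_args(["--eval-ids", "1,3,10"])
# Split and convert to ints only when the flag was given.
eval_ids = [int(x) for x in args.eval_ids.split(",")] if args.eval_ids else None
print(eval_ids)  # → [1, 3, 10]
```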
* fix test_arg_parser.py to account for added arg
* fix typo in README to say "summarize" instead of "summarise" for script