start working on dpo for the datasets

Alex O'Connell
2024-03-19 21:31:34 -04:00
parent b9d394f860
commit f1659893d7
2 changed files with 80 additions and 7 deletions

@@ -1,5 +1,6 @@
# TODO
-- [ ] setup github actions to build wheels that are optimized for RPIs
+- [ ] setup github actions to build wheels that are optimized for RPIs??
+- [ ] setup github actions to publish docker images for text-gen-webui addon
- [ ] detection/mitigation of too many entities being exposed & blowing out the context length
- [ ] areas/room support
- [ ] figure out DPO for refusals + fixing incorrect entity id
@@ -7,6 +8,8 @@
- add in context learning variables to sys prompt template
- add new options to setup process for setting prompt style + picking fine-tuned/ICL
- [ ] prime kv cache with current "state" so that requests are faster
- [ ] support fine-tuning with RoPE for longer contexts
- [ ] support config via yaml instead of configflow
- [x] ChatML format (actually need to add special tokens)
- [x] Vicuna dataset merge (yahma/alpaca-cleaned)
- [x] Phi-2 fine tuning