Summary of SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models, by Jiale Cheng et al.
SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models, by Jiale Cheng, Xiao…