Summary: https://arxiv.org/pdf/2501.05707

Summary of "Multiagent Finetuning: Self Improvement with Diverse Reasoning Chains"

Authors and Publication

  • Authors: Vighnesh Subramaniam, Yilun Du, Joshua B. Tenenbaum, Antonio Torralba, Shuang Li, Igor Mordatch
  • Published: ICLR 2025
  • Link: https://arxiv.org/abs/2501.05707

Abstract

The paper proposes multiagent finetuning, a method for improving the performance of large language models (LLMs). It addresses a limitation of traditional self-improvement techniques, which often yield diminishing returns after a few iterations of training. Instead of finetuning a single model, the authors maintain multiple models, all derived from the same base model, and independently specialize each one on data generated through interactions among the models. This preserves diversity in reasoning and sustains performance gains over many more rounds of finetuning.

Key Contributions

  1. Multiagent Interaction: The paper introduces the concept of using a society of language models to facilitate self-improvement.
  2. Specialization of Models: Different models are assigned distinct roles (e.g., generation agents and critic agents) so that critic feedback improves the quality of generated outputs; a minimal sketch of one such feedback loop follows this list.
  3. Quantitative Validation: The authors provide empirical evidence demonstrating the effectiveness of their approach across various reasoning tasks, showing significant performance improvements over traditional single-agent methods.
  4. Generalization: The finetuned models exhibit the ability to generalize to new datasets without additional training.
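To make the generation/critic split concrete, here is a minimal Python sketch of one feedback loop. This is an illustration under stated assumptions, not the paper's implementation: the `chat` helper, model names, and prompts are hypothetical stand-ins for whatever LLM client and prompting scheme is actually used.

```python
# Illustrative generation-agent / critic-agent feedback loop.
# `chat` is a hypothetical stand-in for any LLM completion call
# (e.g., a local model or an API client), not the paper's code.

def chat(model: str, prompt: str) -> str:
    """Placeholder for an LLM call; swap in a real client."""
    raise NotImplementedError

def generation_step(question: str, gen_model: str) -> str:
    # A generation agent proposes an answer with its reasoning chain.
    return chat(gen_model, f"Answer step by step:\n{question}")

def critic_step(question: str, answer: str, critic_model: str) -> str:
    # A critic agent reviews the proposal and returns a revised answer.
    prompt = (
        f"Question:\n{question}\n\nProposed answer:\n{answer}\n\n"
        "Check the reasoning for mistakes and give a corrected final answer."
    )
    return chat(critic_model, prompt)

def one_feedback_round(question: str) -> str:
    # Generation followed by criticism is one unit of the feedback loop.
    draft = generation_step(question, gen_model="generator-0")
    return critic_step(question, draft, critic_model="critic-0")
```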

Methodology

  • Multiagent Debate: The authors utilize a debate mechanism where multiple models generate responses to queries, and a majority voting system determines the best output. This process creates a rich dataset for finetuning.
  • Independent Specialization: Each model is finetuned on different subsets of data generated through multiagent interactions, promoting specialization and diversity in responses.
  • Iterative Improvement: Repeating this generate-vote-finetune cycle lets the multiagent system keep improving over many rounds of finetuning, sidestepping the diminishing returns that limit single-model self-improvement; a sketch combining these steps follows this list.
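Taken together, the bullets above suggest a single training loop: debate to get consensus answers, build one data subset per agent, finetune each agent independently, repeat. The sketch below shows that shape in Python under loudly stated assumptions: `chat` and `finetune` are hypothetical placeholders, the debate is simplified to one exchange with a simple majority vote over final answers, and for brevity every agent is treated as a generation agent (the paper also maintains critic agents, as in the sketch above).

```python
from collections import Counter

def chat(model, prompt: str) -> str:
    """Placeholder for an LLM completion call; swap in a real client."""
    raise NotImplementedError

def finetune(model, examples: list[tuple[str, str]]):
    """Placeholder for supervised finetuning on (prompt, target) pairs;
    should return the updated model."""
    raise NotImplementedError

def debate_round(question: str, agents: list) -> tuple[str, list[str]]:
    # Each agent answers independently; the majority answer serves as
    # the consensus label for building finetuning data.
    answers = [chat(agent, question) for agent in agents]
    consensus, _ = Counter(answers).most_common(1)[0]
    return consensus, answers

def multiagent_finetuning(questions, agents, rounds: int = 3):
    for _ in range(rounds):
        # One data subset per agent, so each agent specializes on the
        # responses it produced that agreed with the consensus.
        subsets = [[] for _ in agents]
        for q in questions:
            consensus, answers = debate_round(q, agents)
            for i, ans in enumerate(answers):
                if ans == consensus:
                    subsets[i].append((q, ans))
        # Independent finetuning keeps the society of models diverse.
        agents = [finetune(a, data) for a, data in zip(agents, subsets)]
    return agents
```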

Results

The results indicate that multiagent finetuning delivers consistent gains over multiple rounds of finetuning, outperforming single-agent self-improvement methods, whose gains saturate after a few rounds. The models were evaluated on a range of reasoning tasks and, consistent with the generalization claim above, also transferred zero-shot to datasets they were not finetuned on.

Conclusion

The paper presents a promising direction for improving LLMs through multiagent systems, highlighting the potential for sustained performance enhancements and the ability to generalize across different tasks and datasets.

For more detailed insights, you can access the full paper at https://arxiv.org/abs/2501.05707.
