Abstract: Large language models (LLMs) have achieved remarkable progress in recent years. Nevertheless, the prevailing centralized paradigm for training generative artificial intelligence (AI) is approaching its structural limits. First, the concentration of large-scale graphics processing unit (GPU) clusters restricts access to the pre-training stage, confining foundation model development to a small number of resource-rich institutions. Second, the economic and energy costs of operating massive data centers render this paradigm progressively less sustainable. Third, hardware gatekeeping narrows participation to computer science specialists, limiting the involvement of domain experts who are essential for high-impact applications. Finally, small- and medium-sized enterprises remain dependent on expensive application programming interfaces (APIs) or shallow fine-tuning methods that are insufficient to modify a model's core knowledge. Together, these constraints impede innovation and hinder equitable access to next-generation AI systems. Model fusion offers a scalable alternative by integrating multiple specialized models without retraining from scratch. This paper analyzes the current landscape of model fusion, outlining the strengths and limitations of existing methods and discussing future directions. We highlight recent advances such as InfiFusion, InfiGFusion, and InfiFPO, which improve alignment and scalability through techniques like top-K logit selection, graph-based distillation, and preference optimization. These techniques demonstrate substantial gains in efficiency and reasoning, pointing toward a more accessible and resource-aware paradigm for large-scale model development. Finally, we discuss the practical applicability of model fusion, using the energy domain as an illustrative example.
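
To make the top-K logit selection mentioned above concrete, the sketch below shows one way such a distillation objective could be written in PyTorch. This is a minimal illustrative sketch, not the implementation used in InfiFusion: the function name `top_k_distillation_loss`, the default values of K and the temperature are assumptions, and the code presumes that the source (teacher) and pivot (student) logits have already been aligned to a shared vocabulary.

```python
import torch
import torch.nn.functional as F

def top_k_distillation_loss(student_logits: torch.Tensor,
                            teacher_logits: torch.Tensor,
                            k: int = 10,
                            temperature: float = 2.0) -> torch.Tensor:
    """KL-divergence distillation restricted to the teacher's top-K logits.

    Both tensors have shape (batch, seq_len, vocab_size) and are assumed
    to be defined over the same (or an already aligned) vocabulary.
    """
    # Indices and values of the K most probable tokens under the teacher.
    topk_vals, topk_idx = teacher_logits.topk(k, dim=-1)

    # Gather the student's logits at the same token positions so both
    # distributions live on the identical K-token support.
    student_topk = student_logits.gather(-1, topk_idx)

    # Temperature-scaled distributions over the restricted support.
    teacher_probs = F.softmax(topk_vals / temperature, dim=-1)
    student_log_probs = F.log_softmax(student_topk / temperature, dim=-1)

    # Forward KL(teacher || student), averaged over all token positions.
    kl = F.kl_div(student_log_probs.reshape(-1, k),
                  teacher_probs.reshape(-1, k),
                  reduction="batchmean")
    return kl * (temperature ** 2)
```

In a fusion setting, the teacher would be one of the heterogeneous source models and the student the pivot model being fused into; restricting the objective to the top-K logits keeps the memory and compute cost independent of the full vocabulary size, which is part of why such approaches scale better than full-distribution distillation.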