GraphChain: Large Language Models for Large-scale Graph Analysis via Tool Chaining

Chunyu Wei1     Wenji Hu1     Xingjia Hao2     Xin Wang1     Yifan Yang3     Yunhai Wang1     Yang Tian2     Yueguo Chen1    

1Renmin University of China     2Guangxi University     3Beijing Jiaotong University    

Accepted at the Conference on Neural Information Processing Systems (NeurIPS)

Figure 1: Comparison of Graph Processing Approaches with LLMs. Left: Direct-input methods suffer from Context Exhaustion, where large graphs exceed LLM context windows. Center: Single-tool approaches face Reasoning Hallucination with fixed, predefined tools. Right: Our GraphChain framework enables human-like exploratory analysis through sequential tools that progressively narrow focus in large-scale graphs.


Abstract:

Large Language Models (LLMs) face significant limitations when applied to large-scale graphs, struggling with context constraints and inflexible reasoning. We present GraphChain, a framework that enables LLMs to analyze complex graphs through dynamic sequences of specialized tools, mimicking human exploratory intelligence. Our approach introduces two key innovations: (1) Progressive Graph Distillation, a reinforcement learning mechanism that generates optimized tool sequences balancing task relevance with information compression, and (2) Structure-aware Test-Time Adaptation, which efficiently tailors tool selection strategies to diverse graph topologies using spectral properties and lightweight adapters without costly retraining. Experiments show GraphChain significantly outperforms prior methods, enabling scalable and adaptive LLM-driven graph analysis.
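The Progressive Graph Distillation objective described above can be illustrated with a toy sketch. All names here (`description_length`, `step_reward`, the toy tools, and the weight `lam`) are hypothetical stand-ins, not the paper's actual reward or tool implementations: each tool step shrinks the memory state while a reward balances task relevance against compression.

```python
# Hypothetical sketch of a progressive-distillation reward: each tool in the
# chain should shrink the memory state's description length while preserving
# task-relevant information. All names and constants are illustrative.

def description_length(memory: str) -> int:
    """Proxy for Graph Description Length: size of the serialized state."""
    return len(memory)

def step_reward(prev_mem: str, new_mem: str, relevance: float, lam: float = 0.01) -> float:
    """Reward = task relevance plus a bonus proportional to compression achieved."""
    compression = description_length(prev_mem) - description_length(new_mem)
    return relevance + lam * compression

# Toy tool chain: each tool maps a memory state to a smaller summary.
tools = {
    "subgraph_extract": lambda m: m[: len(m) // 2],  # keep half of the state
    "summarize": lambda m: m[:32],                   # fixed-size summary
}

memory = "x" * 1000  # placeholder for a serialized large graph
total = 0.0
for name in ["subgraph_extract", "summarize"]:
    new_memory = tools[name](memory)
    total += step_reward(memory, new_memory, relevance=1.0)
    memory = new_memory
# memory is now a 32-character summary; total accumulates per-step rewards
```

An RL agent trained against such a signal is pushed toward tool sequences that end in compact, task-focused states rather than dumping the whole graph into context.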

Source Code: https://github.com/GraphChain651/GraphChain




Figures:





Figure 2: (1) Training Phase: Progressive graph distillation, where the RL agent learns to select tool sequences that iteratively reduce the Graph Description Length (GDL) of the memory state m while maximizing task relevance. (2) Structure-aware Test-Time Adaptation: a lightweight adapter (Aψ), tuned by minimizing chain length and KL divergence, generates a structure-specific soft prompt PG from the graph's SVD-derived fingerprint zG.
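The SVD-derived fingerprint zG in Figure 2 can be sketched as follows. This is a minimal illustration assuming the fingerprint is built from the leading singular values of the adjacency matrix; the function name `graph_fingerprint` and the choice of k are hypothetical, not the paper's exact construction.

```python
# Hypothetical sketch: a structure fingerprint z_G from the top-k singular
# values of a graph's adjacency matrix, normalized to unit length so graphs
# of different scales remain comparable.
import numpy as np

def graph_fingerprint(adj: np.ndarray, k: int = 4) -> np.ndarray:
    """Top-k singular values of the adjacency matrix as a topology signature."""
    s = np.linalg.svd(adj, compute_uv=False)  # singular values, descending
    top = s[:k]
    if top.shape[0] < k:                      # pad for graphs with < k nodes
        top = np.pad(top, (0, k - top.shape[0]))
    norm = np.linalg.norm(top)
    return top / norm if norm > 0 else top

# Two different topologies on 6 nodes yield distinct fingerprints.
ring = np.roll(np.eye(6), 1, axis=1) + np.roll(np.eye(6), -1, axis=1)
star = np.zeros((6, 6))
star[0, 1:] = 1
star[1:, 0] = 1
z_ring = graph_fingerprint(ring)
z_star = graph_fingerprint(star)
```

Because spectral summaries like this are fixed-size regardless of graph scale, a lightweight adapter can consume them to condition tool selection without retraining on each new graph.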



Figure 3: Impact of removing graph distillation or test-time adaptation.



Figure 4: Comparison across varying graph sizes and query complexities.



Figure 5: Distribution of tool types utilized by GraphChain across different graph domains.



Figure 6: A typical case of GraphChain on Financial Networks.




Acknowledgements:

This research was supported by the National Key R&D Program of China (No. 2023YFC3304701) and in part by the Young Elite Scientists Sponsorship Program by CAST under contract No. 2022QNRC001. It was also supported by Big Data and Responsible Artificial Intelligence for National Governance, Renmin University of China.