GraphChain: Large Language Models for Large-scale Graph Analysis via Tool Chaining

Chunyu Wei1     Wenji Hu1     Xingjia Hao2     Xin Wang1     Yifan Yang3     Yunhai Wang1     Yang Tian2     Yueguo Chen1    

1Renmin University of China     2Guangxi University     3Beijing Jiaotong University    

Accepted at the Conference on Neural Information Processing Systems (NeurIPS)

Figure 1: Comparison of Graph Processing Approaches with LLMs. Left: Direct-input methods suffer from Context Exhaustion, where large graphs exceed LLM context windows. Center: Single-tool approaches face Reasoning Hallucination with fixed, predefined tools. Right: Our GraphChain framework enables human-like exploratory analysis through sequential tools that progressively narrow focus in large-scale graphs.


Abstract:

Large Language Models (LLMs) face significant limitations when applied to large-scale graphs, struggling with context constraints and inflexible reasoning. We present GraphChain, a framework that enables LLMs to analyze complex graphs through dynamic sequences of specialized tools, mimicking human exploratory intelligence. Our approach introduces two key innovations: (1) Progressive Graph Distillation, a reinforcement learning mechanism that generates optimized tool sequences balancing task relevance with information compression, and (2) Structure-aware Test-Time Adaptation, which efficiently tailors tool selection strategies to diverse graph topologies using spectral properties and lightweight adapters without costly retraining. Experiments show GraphChain significantly outperforms prior methods, enabling scalable and adaptive LLM-driven graph analysis.
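The Progressive Graph Distillation objective described above can be illustrated with a toy sketch. All names here (`description_length`, `step_reward`, the toy tools, and the weight `lam`) are hypothetical stand-ins, not the paper's actual reward or tool implementations: each tool step shrinks the memory state while a reward balances task relevance against compression.

```python
# Hypothetical sketch of a progressive-distillation reward: each tool in the
# chain should shrink the memory state's description length while preserving
# task-relevant information. All names and constants are illustrative.

def description_length(memory: str) -> int:
    """Proxy for Graph Description Length: size of the serialized state."""
    return len(memory)

def step_reward(prev_mem: str, new_mem: str, relevance: float, lam: float = 0.01) -> float:
    """Reward = task relevance plus a bonus proportional to compression achieved."""
    compression = description_length(prev_mem) - description_length(new_mem)
    return relevance + lam * compression

# Toy tool chain: each tool maps a memory state to a smaller summary.
tools = {
    "subgraph_extract": lambda m: m[: len(m) // 2],  # keep half of the state
    "summarize": lambda m: m[:32],                   # fixed-size summary
}

memory = "x" * 1000  # placeholder for a serialized large graph
total = 0.0
for name in ["subgraph_extract", "summarize"]:
    new_memory = tools[name](memory)
    total += step_reward(memory, new_memory, relevance=1.0)
    memory = new_memory
# memory is now a 32-character summary; total accumulates per-step rewards
```

An RL agent trained against such a signal is pushed toward tool sequences that end in compact, task-focused states rather than dumping the whole graph into context.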

Source Code: https://github.com/GraphChain651/GraphChain




Figures:





Figure 2: (1) Training Phase: Progressive graph distillation, where the RL agent learns to select tool sequences that iteratively reduce the Graph Description Length (GDL) of the memory state m while maximizing task relevance. (2) Structure-aware Test-Time Adaptation: a lightweight adapter (Aψ), tuned by minimizing chain length and KL divergence, generates a structure-specific soft prompt PG from the graph's SVD-derived fingerprint zG.
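The SVD-derived fingerprint zG in Figure 2 can be sketched as follows. This is a minimal illustration assuming the fingerprint is built from the leading singular values of the adjacency matrix; the function name `graph_fingerprint` and the choice of k are hypothetical, not the paper's exact construction.

```python
# Hypothetical sketch: a structure fingerprint z_G from the top-k singular
# values of a graph's adjacency matrix, normalized to unit length so graphs
# of different scales remain comparable.
import numpy as np

def graph_fingerprint(adj: np.ndarray, k: int = 4) -> np.ndarray:
    """Top-k singular values of the adjacency matrix as a topology signature."""
    s = np.linalg.svd(adj, compute_uv=False)  # singular values, descending
    top = s[:k]
    if top.shape[0] < k:                      # pad for graphs with < k nodes
        top = np.pad(top, (0, k - top.shape[0]))
    norm = np.linalg.norm(top)
    return top / norm if norm > 0 else top

# Two different topologies on 6 nodes yield distinct fingerprints.
ring = np.roll(np.eye(6), 1, axis=1) + np.roll(np.eye(6), -1, axis=1)
star = np.zeros((6, 6))
star[0, 1:] = 1
star[1:, 0] = 1
z_ring = graph_fingerprint(ring)
z_star = graph_fingerprint(star)
```

Because spectral summaries like this are fixed-size regardless of graph scale, a lightweight adapter can consume them to condition tool selection without retraining on each new graph.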



Figure 3: Impact of removing graph distillation or test-time adaptation.



Figure 4: Comparison across varying graph sizes and query complexities.



Figure 5: Distribution of tool types utilized by GraphChain across different graph domains.



Figure 6: A typical case of GraphChain on Financial Networks.




Acknowledgements:

This research was supported by the National Key R&D Program of China (No. 2023YFC3304701) and in part by the Young Elite Scientists Sponsorship Program by CAST under contract No. 2022QNRC001. It was also supported by Big Data and Responsible Artificial Intelligence for National Governance, Renmin University of China.