CVE-2026-26019: @langchain/community affected by SSRF Bypass in RecursiveUrlLoader via insufficient URL origin validation
The RecursiveUrlLoader class in @langchain/community is a web crawler that recursively follows links from a starting URL. Its preventOutside option (enabled by default) is intended to restrict crawling to the same site as the base URL.
The implementation used String.startsWith() to compare URLs, which compares raw strings and does not parse or validate URL components such as the hostname. An attacker who controls content on a crawled page could include links to domains that share a string prefix with the target (e.g., https://example.com.attacker.com passes a startsWith check against https://example.com), causing the crawler to follow links to attacker-controlled or internal infrastructure.
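The prefix-matching weakness described above can be illustrated with a minimal sketch. The helper names naivePrefixCheck and sameOriginCheck are hypothetical and are not taken from the library. The origin comparison is one possible stricter check and is not necessarily what the patch implements.

```typescript
// Hypothetical helpers for illustration; not the library's actual code.

// Raw string comparison, similar in spirit to the vulnerable check.
function naivePrefixCheck(base: string, candidate: string): boolean {
  return candidate.startsWith(base);
}

// Parse both URLs and compare scheme, host, and port together.
function sameOriginCheck(base: string, candidate: string): boolean {
  try {
    return new URL(candidate).origin === new URL(base).origin;
  } catch {
    // Unparseable URLs are treated as outside the allowed site.
    return false;
  }
}

const base = "https://example.com";
const malicious = "https://example.com.attacker.com/page";
const legitimate = "https://example.com/docs/intro";

console.log(naivePrefixCheck(base, malicious)); // true: the bypass
console.log(sameOriginCheck(base, malicious)); // false: different host
console.log(sameOriginCheck(base, legitimate)); // true: same origin
```

Comparing parsed origins rejects the look-alike domain because the hostname is evaluated as a whole rather than as a string prefix.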
Additionally, the crawler performed no validation against private or reserved IP addresses. A crawled page could include links targeting cloud metadata services (169.254.169.254), localhost, or RFC 1918 addresses, and the crawler would fetch them without restriction.
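As a rough sketch of the kind of check that was missing, the following rejects URLs whose host is localhost, the IPv6 loopback, or an IPv4 literal in a loopback, link-local, or RFC 1918 range. The function names isPrivateOrReservedHost and isBlockedUrl are hypothetical, and the range list is deliberately incomplete. It does not resolve DNS, so a public hostname that resolves to an internal address would still need separate handling.

```typescript
// Hypothetical helpers for illustration; not the library's actual code.

function isPrivateOrReservedHost(hostname: string): boolean {
  const host = hostname.toLowerCase();
  if (host === "localhost" || host.endsWith(".localhost")) return true;
  if (host === "[::1]") return true; // IPv6 loopback as reported by URL.hostname

  // Only dotted-quad IPv4 literals are examined below.
  const m = host.match(/^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/);
  if (!m) return false;

  const a = Number(m[1]);
  const b = Number(m[2]);
  return (
    a === 0 || // "this network"
    a === 127 || // loopback
    a === 10 || // RFC 1918: 10.0.0.0/8
    (a === 172 && b >= 16 && b <= 31) || // RFC 1918: 172.16.0.0/12
    (a === 192 && b === 168) || // RFC 1918: 192.168.0.0/16
    (a === 169 && b === 254) // link-local, includes 169.254.169.254
  );
}

function isBlockedUrl(url: string): boolean {
  try {
    return isPrivateOrReservedHost(new URL(url).hostname);
  } catch {
    // Refuse anything that cannot be parsed as a URL.
    return true;
  }
}

console.log(isBlockedUrl("http://169.254.169.254/latest/meta-data/"));
console.log(isBlockedUrl("https://example.com/"));
```

A crawler could call a check like this on every discovered link before fetching it, in addition to the same-site comparison.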
References
- github.com/advisories/GHSA-gf3v-fwqg-4vh7
- github.com/langchain-ai/langchainjs
- github.com/langchain-ai/langchainjs/commit/d5e3db0d01ab321ec70a875805b2f74aefdadf9d
- github.com/langchain-ai/langchainjs/pull/9990
- github.com/langchain-ai/langchainjs/releases/tag/%40langchain%2Fcommunity%401.1.14
- github.com/langchain-ai/langchainjs/security/advisories/GHSA-gf3v-fwqg-4vh7
- nvd.nist.gov/vuln/detail/CVE-2026-26019