github-endpoints automates URL endpoint discovery for target domains: it queries the GitHub Search API, extracts references from raw code files, and uses concurrent worker pools to speed up scraping.
The scraping methodology and multi-index sorting logic in this engine are adapted from the original Python utility github-search. That script works well for focused lookups; github-endpoints ports the same core behavior to a concurrent, modular Go architecture that increases throughput, tolerates errors through dynamic token rotation, and writes clean text results straight into your terminal pipelines.
- Multi-Index Discovery: Cycles automatically through diverse sorting orders (`indexed desc`, `indexed asc`, and relevance tracking) to uncover deeper page listings that GitHub's API limits normally hide.
- Smart Filtering: Cleans noisy output on the fly by filtering out asset images, style headers, standard system font types, and generic web layouts via a strict regex exclusion matrix.
- Dynamic Token Management: Rotates through an active pool of personal access tokens automatically. If a token hits a rate limit, the tool pulls it out of rotation seamlessly and keeps scanning with the rest.
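
As a rough illustration of the Multi-Index Discovery idea, the sketch below builds one request URL per sort order against GitHub's code search endpoint. The names `sortMode` and `buildSearchURL`, the quoted-domain query, and the page size of 100 are assumptions made for this example; the tool's real query construction may differ.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// sortMode is one (sort, order) pair sent to the code search endpoint.
// An empty Sort leaves the API on its default relevance ranking.
type sortMode struct {
	Sort  string
	Order string
}

// sortModes mirrors the three orderings named in the feature list.
var sortModes = []sortMode{
	{Sort: "indexed", Order: "desc"},
	{Sort: "indexed", Order: "asc"},
	{}, // relevance (API default)
}

// buildSearchURL assembles the request URL for one page of one sort mode.
// Hypothetical helper: quoting the domain is an assumption, not the tool's confirmed query.
func buildSearchURL(domain string, mode sortMode, page int) string {
	q := url.Values{}
	q.Set("q", `"`+domain+`"`)
	q.Set("per_page", "100")
	q.Set("page", strconv.Itoa(page))
	if mode.Sort != "" {
		q.Set("sort", mode.Sort)
		q.Set("order", mode.Order)
	}
	return "https://api.github.com/search/code?" + q.Encode()
}

func main() {
	// Print the first page of every sort mode for one target domain.
	for _, mode := range sortModes {
		fmt.Println(buildSearchURL("example.com", mode, 1))
	}
}
```

Running the same query under each ordering surfaces different slices of the result set, which is how cycling sort orders can reach files that a single ordering would cut off.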
graph TD
%% Define Global Styles & Colors
classDef target fill:#1a1c1e,stroke:#30363d,stroke-width:2px,color:#fff;
classDef engine fill:#1f242c,stroke:#ffbc00,stroke-width:2px,color:#fff;
classDef success fill:#1b2a1a,stroke:#2ea44f,stroke-width:2px,color:#fff;
classDef platform fill:#161b22,stroke:#58a6ff,stroke-width:2px,color:#fff;
T["🎯 Target Domain:<br>example.com"]:::target
subgraph CoreEngine ["Scraping Framework"]
S1["1. Query GitHub Search API"]
S2["2. Rotate GitHub Token Pool"]
S3["3. Resolve File URLs to Raw Content"]
end
style CoreEngine fill:#0d1117,stroke:#30363d,stroke-dasharray: 5 5
EX["❌ Exclusion Filter<br>(Skip Images, CSS, Fonts)"]:::platform
RE["📍 Relative Endpoint Capture<br>(Paths, Scripts, AJAX)"]:::success
AB["🌐 Absolute URL Capture<br>(External Domain Discovery)"]:::success
%% Connection Logic Routing
T --> S1
S1 --> S2
S2 --> S3
S3 --> EX
EX -->|Pass Match| RE
EX -->|Pass Match| AB
- Phase 1 (API Ingestion): Queries the code search endpoints with strict target filters, fetching nested code block listings across public repositories.
- Phase 2 (Raw Resolution): Automatically transforms standard GitHub view URLs into their `raw.githubusercontent.com` equivalents to extract clean source bytes.
- Phase 3 (Regex Extraction): Evaluates file blocks through multiple concurrent worker routines (limited to 30 workers) to capture JavaScript variables, relative parameters, AJAX roots, and hardcoded routes.
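
Phases 2 and 3 can be sketched roughly as follows. This is a minimal illustration, not the tool's code: `toRawURL`, `extractPaths`, `extractAll`, and both regular expressions are hypothetical stand-ins for the real regex tables, and the file contents are passed in as strings instead of being fetched over HTTP. Only the cap of 30 workers comes from the description above.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strings"
	"sync"
)

const maxWorkers = 30 // worker cap stated in Phase 3

var (
	// Illustrative pattern for quoted relative paths (assumption, not the real table).
	pathPattern = regexp.MustCompile(`["'](/[A-Za-z0-9_./-]+)["']`)
	// Illustrative exclusion filter for images, stylesheets, and fonts.
	excluded = regexp.MustCompile(`(?i)\.(png|jpe?g|gif|svg|css|woff2?|ttf)$`)
)

// toRawURL rewrites a github.com "blob" view URL to its raw.githubusercontent.com form.
func toRawURL(viewURL string) string {
	raw := strings.Replace(viewURL, "https://github.com/", "https://raw.githubusercontent.com/", 1)
	return strings.Replace(raw, "/blob/", "/", 1)
}

// extractPaths returns the quoted relative paths in content that survive the exclusion filter.
func extractPaths(content string) []string {
	var out []string
	for _, m := range pathPattern.FindAllStringSubmatch(content, -1) {
		if !excluded.MatchString(m[1]) {
			out = append(out, m[1])
		}
	}
	return out
}

// extractAll scans every file concurrently, never running more than maxWorkers
// goroutines at once, and returns the deduplicated paths in sorted order.
func extractAll(files []string) []string {
	sem := make(chan struct{}, maxWorkers) // counting semaphore
	var wg sync.WaitGroup
	var mu sync.Mutex
	seen := map[string]bool{}

	for _, content := range files {
		wg.Add(1)
		sem <- struct{}{} // blocks while maxWorkers scans are in flight
		go func(c string) {
			defer wg.Done()
			defer func() { <-sem }()
			paths := extractPaths(c)
			mu.Lock()
			for _, p := range paths {
				seen[p] = true
			}
			mu.Unlock()
		}(content)
	}
	wg.Wait()

	out := make([]string, 0, len(seen))
	for p := range seen {
		out = append(out, p)
	}
	sort.Strings(out)
	return out
}

func main() {
	fmt.Println(toRawURL("https://github.com/owner/repo/blob/main/src/app.js"))
	files := []string{
		`fetch("/api/v1/users"); img.src = "/static/logo.png";`,
		`$.ajax({url: '/account/login'})`,
	}
	for _, p := range extractAll(files) {
		fmt.Println(p)
	}
}
```

The buffered channel acts as a semaphore, so the worker limit holds no matter how many files a search page returns.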
Install with Go:

    go install -v github.com/R0X4R/github-endpoints@latest

Install from source:

    git clone https://github.com/R0X4R/github-endpoints.git && cd github-endpoints && go install .

Show the available flags:

    github-endpoints -h

| Short Flag | Long Flag | Description |
|---|---|---|
| `-t` | `--token` | GitHub personal access token (optional if token file exists). |
| `-d` | `--domain` | Target domain name to search for (Required). |
| `-e` | `--extend` | Look for subdomains and variations (e.g., test.example.com). |
| `-a` | `--all` | Show results from external domains found during the search. |
| `-r` | `--relative` | Include relative paths and localized endpoints in the output. |
| `-s` | `--source` | Show the specific GitHub file URL where each endpoint was found. |
| `-o` | `--output` | Path to save the output text file. |
| `-v` | `--verbose` | Show detailed debug information and page increments during execution. |
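
For readers curious how short and long flag pairs like these can share one value, here is a small sketch using Go's standard `flag` package. It covers only a subset of the flags in the table, and `options` and `parseArgs` are hypothetical names; the tool's actual argument parser is not shown in this README and may work differently.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
)

// options holds a subset of the settings from the flag table.
type options struct {
	Domain   string
	Token    string
	Extend   bool
	Relative bool
	Verbose  bool
}

// parseArgs binds each short and long name to the same field, so
// "-d example.com" and "--domain example.com" are interchangeable.
func parseArgs(args []string) (options, error) {
	var o options
	fs := flag.NewFlagSet("github-endpoints", flag.ContinueOnError)

	fs.StringVar(&o.Domain, "d", "", "target domain (required)")
	fs.StringVar(&o.Domain, "domain", "", "target domain (required)")
	fs.StringVar(&o.Token, "t", "", "comma-separated access tokens")
	fs.StringVar(&o.Token, "token", "", "comma-separated access tokens")
	fs.BoolVar(&o.Extend, "e", false, "include subdomains and variations")
	fs.BoolVar(&o.Extend, "extend", false, "include subdomains and variations")
	fs.BoolVar(&o.Relative, "r", false, "include relative paths")
	fs.BoolVar(&o.Relative, "relative", false, "include relative paths")
	fs.BoolVar(&o.Verbose, "v", false, "verbose output")
	fs.BoolVar(&o.Verbose, "verbose", false, "verbose output")

	if err := fs.Parse(args); err != nil {
		return o, err
	}
	if o.Domain == "" {
		return o, errors.New("missing required -d/--domain")
	}
	return o, nil
}

func main() {
	// A fixed argument list keeps the example runnable without a shell.
	o, err := parseArgs([]string{"-d", "example.com", "-e", "--relative"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", o)
}
```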
Standard concurrent scan execution against a single domain:

    github-endpoints -d example.com -t "ghp_XXXXXXXXXXXXXXXXXXXX,ghp_YYYYYYYYYYYYYYYYYYYY"

Include relative endpoints, scrape variations, and save results directly to a text file:

    github-endpoints -d tesla.com -e -r -o endpoints.txt

To view real-time adjustments, pagination indexing, or active API responses, append the verbose flag:

    github-endpoints -d tesla.com -v

To maximize throughput and avoid hitting GitHub's secondary rate limits, you can declare multiple personal access tokens inside a local file rather than pasting them into your command flag.
- Create a file named `.github_tokens` inside the same directory as your tool binary.
- Place your tokens inside, using one token per line:

      ghp_XXXXXXXXXXXXXXXXXXXX
      ghp_YYYYYYYYYYYYYYYYYYYY
      ghp_ZZZZZZZZZZZZZZZZZZZZ

The tool automatically detects this file at launch, validates the tokens, and picks one at random for each outgoing API query. If a request fails with an external error (such as a rate limit), the tool drops that token and spreads the remaining work across the tokens that are still usable.
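
The pick-and-drop behavior described above could look something like the sketch below. `TokenPool`, `ParseTokens`, `Pick`, and `Drop` are hypothetical names for this illustration, and the token text is parsed from a string so the example runs without a real `.github_tokens` file; in practice the text would come from reading that file.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"math/rand"
	"strings"
	"sync"
)

// ParseTokens splits file contents into tokens, one per line, skipping blank lines.
func ParseTokens(text string) []string {
	var tokens []string
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		if t := strings.TrimSpace(sc.Text()); t != "" {
			tokens = append(tokens, t)
		}
	}
	return tokens
}

// TokenPool is a concurrency-safe set of usable tokens.
type TokenPool struct {
	mu     sync.Mutex
	tokens []string
}

// NewTokenPool copies the given tokens into a fresh pool.
func NewTokenPool(tokens []string) *TokenPool {
	return &TokenPool{tokens: append([]string(nil), tokens...)}
}

// Pick returns a random token, or an error once every token has been dropped.
func (p *TokenPool) Pick() (string, error) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if len(p.tokens) == 0 {
		return "", errors.New("no usable tokens left")
	}
	return p.tokens[rand.Intn(len(p.tokens))], nil
}

// Drop removes a token from rotation, for example after a rate-limit response.
func (p *TokenPool) Drop(token string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	for i, t := range p.tokens {
		if t == token {
			p.tokens = append(p.tokens[:i], p.tokens[i+1:]...)
			return
		}
	}
}

// Len reports how many tokens are still in rotation.
func (p *TokenPool) Len() int {
	p.mu.Lock()
	defer p.mu.Unlock()
	return len(p.tokens)
}

func main() {
	pool := NewTokenPool(ParseTokens("ghp_XXXXXXXXXXXXXXXXXXXX\nghp_YYYYYYYYYYYYYYYYYYYY\n"))
	tok, _ := pool.Pick()
	fmt.Println("picked:", tok)

	pool.Drop(tok) // pretend this token just hit a rate limit
	fmt.Println("tokens left:", pool.Len())
}
```

Guarding the slice with a mutex lets many workers share one pool, and a dropped token simply stops being a candidate for later picks.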
I love PRs! Help me improve this tool with your knowledge, edge-case regex adjustments, and architectural optimization ideas.
If you have discovered a cleaner endpoint extraction pattern, a custom regex exclusion match to drop false positives, or a way to speed up deep pagination queries, feel free to expand the project:
- Fork the repository.
- Update the regex tables inside `pkg/config.go`.
- Submit a Pull Request detailing your optimizations.
Whether it is improving processing loops, refining CLI formatting rules, or expanding documentation, all contributions are highly appreciated!
- Core Structural Logic: Inspired by the research and automation primitives found in github-search by @gwen001.