R0X4R/github-endpoints

github-endpoints


github-endpoints automates URL endpoint discovery for target domains. It queries the GitHub Search API, extracts references from raw code files, and uses concurrent worker pools to speed up scraping.

Background

The scraping methodology and multi-index sorting logic in this engine are adapted from the original Python utility found in github-search. That script works well for focused lookups; github-endpoints ports the same approach to a concurrent, modular Go architecture. The port improves throughput, tolerates errors through dynamic token rotation, and writes clean text results that can be piped straight into other terminal tools.

Features

  • Multi-Index Discovery: Cycles automatically through several sort orders (indexed descending, indexed ascending, and relevance) to reach results that GitHub's API limits would hide from a single query.
  • Smart Filtering: Cleans noisy output on the fly by filtering out image assets, stylesheets, fonts, and generic web layouts through a strict regex exclusion list.
  • Dynamic Token Management: Rotates through an active pool of personal access tokens automatically. If a token hits a rate limit, the tool pulls it out of rotation seamlessly and keeps scanning with the rest.
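
As a rough illustration of the Multi-Index Discovery idea, the sketch below builds one GitHub code search URL per sort order. It assumes the public REST endpoint `/search/code` with its `sort`, `order`, `per_page`, and `page` parameters. The `sortMode` type, the `passes` list, and `searchURL` are hypothetical names chosen for this example, not identifiers from the tool.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// sortMode is a hypothetical type describing one pass over the search results.
type sortMode struct {
	Sort  string // "indexed", or "" for GitHub's default best-match ranking
	Order string // "desc" or "asc"; unused when Sort is empty
}

// passes mirrors the three orderings named in the feature list (assumed mapping).
var passes = []sortMode{
	{Sort: "indexed", Order: "desc"},
	{Sort: "indexed", Order: "asc"},
	{}, // relevance: leave sort unset so the API uses best match
}

// searchURL builds a code search request URL for one domain, sort mode, and page.
func searchURL(domain string, m sortMode, page int) string {
	q := url.Values{}
	q.Set("q", fmt.Sprintf("%q", domain)) // quote the domain for an exact match
	q.Set("per_page", "100")
	q.Set("page", strconv.Itoa(page))
	if m.Sort != "" {
		q.Set("sort", m.Sort)
		q.Set("order", m.Order)
	}
	return "https://api.github.com/search/code?" + q.Encode()
}

func main() {
	for _, m := range passes {
		fmt.Println(searchURL("example.com", m, 1))
	}
}
```

Because each ordering surfaces a different slice of the index, merging and deduplicating the three passes can return more distinct files than any single pass.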

Detection Workflow

graph TD
    %% Define Global Styles & Colors
    classDef target fill:#1a1c1e,stroke:#30363d,stroke-width:2px,color:#fff;
    classDef engine fill:#1f242c,stroke:#ffbc00,stroke-width:2px,color:#fff;
    classDef success fill:#1b2a1a,stroke:#2ea44f,stroke-width:2px,color:#fff;
    classDef platform fill:#161b22,stroke:#58a6ff,stroke-width:2px,color:#fff;

    T["🎯 Target Domain:<br>example.com"]:::target

    subgraph CoreEngine ["Scraping Framework"]
        S1["1. Query GitHub Search API"]
        S2["2. Rotate GitHub Token Pool"]
        S3["3. Resolve File URLs to Raw Content"]
    end
    style CoreEngine fill:#0d1117,stroke:#30363d,stroke-dasharray: 5 5

    EX["❌ Exclusion Filter<br>(Skip Images, CSS, Fonts)"]:::platform
    RE["📍 Relative Endpoint Capture<br>(Paths, Scripts, AJAX)"]:::success
    AB["🌐 Absolute URL Capture<br>(External Domain Discovery)"]:::success

    %% Connection Logic Routing
    T --> S1
    S1 --> S2
    S2 --> S3
    S3 --> EX
    EX -->|Pass Match| RE
    EX -->|Pass Match| AB

  1. Phase 1 (API Ingestion): Queries the code search endpoint with strict target filters and collects matching file listings from public repositories.
  2. Phase 2 (Raw Resolution): Converts standard GitHub view URLs into their raw.githubusercontent.com equivalents to fetch the unrendered source.
  3. Phase 3 (Regex Extraction): Scans file contents in concurrent worker routines (capped at 30 workers) to capture JavaScript variables, relative parameters, AJAX roots, and hardcoded routes.
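
Phases 2 and 3 could be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: `toRawURL`, `extract`, and both regular expressions are assumptions made for the example, and the real patterns live in the project's own regex tables. File contents are passed in as strings so the sketch runs without network access.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strings"
	"sync"
)

// Illustrative patterns only; they are much simpler than a real extraction table.
var (
	urlPattern = regexp.MustCompile(`https?://[^\s"'<>]+`)
	excluded   = regexp.MustCompile(`(?i)\.(png|jpe?g|gif|svg|ico|css|woff2?|ttf|eot)(\?.*)?$`)
)

const maxWorkers = 30 // worker cap stated in Phase 3

// toRawURL rewrites a github.com "blob" view URL to its raw.githubusercontent.com form.
func toRawURL(viewURL string) string {
	raw := strings.Replace(viewURL, "https://github.com/", "https://raw.githubusercontent.com/", 1)
	return strings.Replace(raw, "/blob/", "/", 1)
}

// extract scans file bodies with a bounded worker pool and returns the
// unique, sorted absolute URLs that survive the exclusion filter.
func extract(files []string) []string {
	jobs := make(chan string)
	results := make(chan string)
	var wg sync.WaitGroup

	workers := maxWorkers
	if len(files) < workers {
		workers = len(files)
	}
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for body := range jobs {
				for _, u := range urlPattern.FindAllString(body, -1) {
					if !excluded.MatchString(u) {
						results <- u
					}
				}
			}
		}()
	}
	go func() { // feed the workers, then signal that no more jobs are coming
		for _, f := range files {
			jobs <- f
		}
		close(jobs)
	}()
	go func() { // close results once every worker has finished
		wg.Wait()
		close(results)
	}()

	seen := map[string]bool{}
	var out []string
	for u := range results {
		if !seen[u] {
			seen[u] = true
			out = append(out, u)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	fmt.Println(toRawURL("https://github.com/owner/repo/blob/main/src/app.js"))
	fmt.Println(extract([]string{
		`var api = "https://api.example.com/v1/users";`,
		`<img src="https://example.com/logo.png"> fetch('https://example.com/login')`,
	}))
}
```

Collecting results in a single goroutine keeps the deduplication map free of locks, while the worker cap bounds how many files are scanned at once.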

Installation

go install -v github.com/R0X4R/github-endpoints@latest

Install from source

git clone https://github.com/R0X4R/github-endpoints.git && cd github-endpoints && go install .

Usage

github-endpoints -h

Reference Flags

| Short Flag | Long Flag | Description |
| --- | --- | --- |
| -t | --token | GitHub personal access token (optional if token file exists). |
| -d | --domain | Target domain name to search for (Required). |
| -e | --extend | Look for subdomains and variations (e.g., test.example.com). |
| -a | --all | Show results from external domains found during the search. |
| -r | --relative | Include relative paths and localized endpoints in the output. |
| -s | --source | Show the specific GitHub file URL where each endpoint was found. |
| -o | --output | Path to save the output text file. |
| -v | --verbose | Show detailed debug information and page increments during execution. |

Operational Examples

Scan a single domain, passing two comma-separated tokens on the command line:

github-endpoints -d example.com -t "ghp_XXXXXXXXXXXXXXXXXXXX,ghp_YYYYYYYYYYYYYYYYYYYY"

Include relative endpoints, scrape variations, and save results directly to a text file:

github-endpoints -d tesla.com -e -r -o endpoints.txt

To view real-time adjustments, pagination indexing, or active API responses, append the verbose flag:

github-endpoints -d tesla.com -v

Configuration

Managing Multiple API Keys

To maximize throughput and avoid hitting GitHub's secondary rate limits, you can declare multiple personal access tokens inside a local file rather than pasting them into your command flag.

  • Create a file named .github_tokens inside the same directory as your tool binary.

  • Place your tokens inside, using one token per line:

    ghp_XXXXXXXXXXXXXXXXXXXX
    ghp_YYYYYYYYYYYYYYYYYYYY
    ghp_ZZZZZZZZZZZZZZZZZZZZ
    

The tool detects this file at launch, validates the tokens, and picks a random one for each outgoing API query. If a request fails because of a token, the tool drops that token and spreads the remaining work across the tokens that are left.

🤝 Contributing

I love PRs! Help me improve this tool with your knowledge, edge-case regex adjustments, and architectural optimization ideas.

If you have discovered a cleaner endpoint extraction pattern, a custom regex exclusion match to drop false positives, or a way to speed up deep pagination queries, feel free to expand the project:

  1. Fork the repository.
  2. Update the regex tables inside pkg/config.go.
  3. Submit a Pull Request detailing your optimizations.

Whether it is improving processing loops, refining CLI formatting rules, or expanding documentation, all contributions are highly appreciated!

Credits

  • Core Structural Logic: Inspired by the research and automation primitives found in github-search by @gwen001.

About

Find URL endpoints by searching GitHub repositories.
