hideyuda/datasync

Sync SaaS Tools to GitHub for AI Context

Turn Gmail, Google Drive, Google Calendar, Slack, Chatwork, Google Chat, Notion, Zoom, Microsoft Teams, and Limitless into a Git-backed AI knowledge base. Each template syncs data with GitHub Actions, commits readable files, and lets Cursor, Claude Code, Codex, or another coding agent use those files as local context.

Use this project when you want to sync Slack to GitHub, back up Gmail to GitHub, convert Google Drive to Markdown, export Google Calendar to Markdown, save Zoom transcripts, sync Limitless lifelogs, or keep Notion, Teams, and chat tools available as LLM context without wiring every AI workflow directly to SaaS APIs.

Supported Tools

| Tool | Template | Output | Notes |
| --- | --- | --- | --- |
| Gmail | gmail | .eml files and JSON metadata | Defaults to recent mail unless GMAIL_FULL_SYNC=true or GMAIL_QUERY is set. |
| Google Drive | gdrive | Markdown | Exports Google Docs, Slides, and Sheets. Non-Google files are skipped. |
| Google Calendar | gcal | Daily Markdown summaries | Covers the configured past/future window. |
| Slack | slack | Daily JSONL logs | Limited to channels the token can read. |
| Chatwork | chatwork | Daily JSONL logs | Can be scoped to specific rooms. |
| Google Chat | gchat | Message data files | Requires Google Chat API access and configured spaces. |
| Notion | notion | JSON and simple Markdown | Saves page/database metadata and properties. |
| Zoom | zoom | Recording metadata, Markdown summaries, transcripts | Audio/video download is disabled by default. |
| Microsoft Teams | teams | Daily JSONL channel logs | Uses Microsoft Graph app-only auth; chats and meeting recordings are not enabled by default. |
| Limitless | limitless | Lifelog JSON, Markdown, and contents JSONL | Supports Pendant lifelogs; audio download is disabled by default. |
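
The Gmail row mentions two settings, GMAIL_FULL_SYNC and GMAIL_QUERY. The sketch below shows one way a sync script could turn them into a Gmail search query. The precedence (an explicit query wins over full sync) and the `newer_than:7d` default are assumptions made for illustration, not the template's confirmed behavior; the gmail README is the authority.

```python
from typing import Mapping, Optional

# Assumption: the real "recent mail" window is defined by the gmail template.
# Seven days is only an illustrative default.
DEFAULT_RECENT_QUERY = "newer_than:7d"


def resolve_gmail_query(env: Mapping[str, str]) -> Optional[str]:
    """Return the Gmail search query to use, or None for an unfiltered full sync."""
    custom = env.get("GMAIL_QUERY", "").strip()
    if custom:
        # Assumed precedence: an explicit query overrides everything else.
        return custom
    if env.get("GMAIL_FULL_SYNC", "").strip().lower() == "true":
        return None
    return DEFAULT_RECENT_QUERY
```

Passing `os.environ` as `env` would make the function read the same variables a GitHub Actions workflow exports.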

Why GitHub for AI Context

  1. Files are portable. Cursor, Claude Code, Codex, local scripts, and custom agents can all read ordinary files.
  2. Git gives you history. git log and diffs show what changed between sync runs.
  3. Context stays focused. Agents can open the exact email, meeting note, transcript, or chat log they need instead of loading a large API response.
  4. Sync is decoupled. GitHub Actions handles scheduled updates in the background, so coding workflows do not wait on SaaS APIs.
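
As an illustration of "ordinary files", here is a minimal sketch of how a local script or agent tool might load one day of chat logs. The `<root>/<channel>/<YYYY-MM-DD>.jsonl` layout and the record fields are assumptions; each service README documents the actual output paths.

```python
import json
from pathlib import Path
from typing import Iterator, List


def read_jsonl(path: Path) -> Iterator[dict]:
    """Yield one parsed record per non-empty line of a JSONL file."""
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def load_day(root: Path, channel: str, day: str) -> List[dict]:
    """Load one channel's log for one day; return [] when no file exists.

    Assumes a hypothetical <root>/<channel>/<YYYY-MM-DD>.jsonl layout.
    """
    path = root / channel / f"{day}.jsonl"
    if not path.exists():
        return []
    return list(read_jsonl(path))
```

Because the reader works on plain paths, it needs no API token and no network access at query time.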

Project Shape

Each service directory is a standalone, repo-ready template:

gmail/
  src/sync_gmail.py
  requirements.txt
  .github/workflows/sync.yml
  README.md

You can copy one directory into its own private repository, configure secrets, and run it independently. Your main AI or agent repository can then reference those data repositories as Git submodules.

Quick Start

  1. Choose a service template such as gmail, slack, or zoom.
  2. Create a private GitHub repository for that service data.
  3. Copy the template contents into that repository.
  4. Configure the required GitHub Actions secrets and variables.
  5. Run the workflow manually from the Actions tab with workflow_dispatch.
  6. Add each data repository you created to your main project as a submodule:

git submodule add https://github.com/<user>/gmail data/gmail
git submodule add https://github.com/<user>/gdrive data/gdrive
git submodule add https://github.com/<user>/gcal data/gcal
git submodule add https://github.com/<user>/slack data/slack
git submodule add https://github.com/<user>/chatwork data/chatwork
git submodule add https://github.com/<user>/gchat data/gchat
git submodule add https://github.com/<user>/notion data/notion
git submodule add https://github.com/<user>/zoom data/zoom
git submodule add https://github.com/<user>/teams data/teams
git submodule add https://github.com/<user>/limitless data/limitless

Fetch the latest synced data from your main repository:

git submodule update --remote --merge

Google OAuth Setup

Gmail, Google Drive, and Google Calendar can share one OAuth refresh token when you request all required scopes.

  1. Open the Google Cloud Console.
  2. Create or select a project.
  3. Enable the Gmail API, Google Drive API, and Google Calendar API.
  4. Configure the OAuth consent screen. In testing mode, add your own email as a test user.
  5. Create an OAuth Client ID with application type Desktop App.
  6. Copy the client ID and client secret.

Generate a refresh token with scripts/generate_refresh_token.py:

pip install google-auth-oauthlib
python scripts/generate_refresh_token.py

Use the printed REFRESH_TOKEN in the Google service repositories.
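
For orientation, the following is a rough sketch of what a refresh-token script can look like with google-auth-oauthlib. It is not the contents of scripts/generate_refresh_token.py. The read-only scopes are an assumption; use the scopes listed in each service README.

```python
# Assumption: read-only scopes are enough for syncing. Check each service
# README for the exact scopes the templates require.
SCOPES = [
    "https://www.googleapis.com/auth/gmail.readonly",
    "https://www.googleapis.com/auth/drive.readonly",
    "https://www.googleapis.com/auth/calendar.readonly",
]


def build_client_config(client_id: str, client_secret: str) -> dict:
    """Build the in-memory equivalent of a Desktop App client secrets file."""
    return {
        "installed": {
            "client_id": client_id,
            "client_secret": client_secret,
            "auth_uri": "https://accounts.google.com/o/oauth2/auth",
            "token_uri": "https://oauth2.googleapis.com/token",
            "redirect_uris": ["http://localhost"],
        }
    }


def generate_refresh_token(client_id: str, client_secret: str) -> str:
    """Run the browser-based consent flow and return the refresh token."""
    # Imported lazily so build_client_config works without the dependency.
    from google_auth_oauthlib.flow import InstalledAppFlow

    flow = InstalledAppFlow.from_client_config(
        build_client_config(client_id, client_secret), scopes=SCOPES
    )
    # port=0 picks a free local port; prompt="consent" asks Google to show
    # the consent screen again so a refresh token is issued.
    creds = flow.run_local_server(port=0, prompt="consent")
    return creds.refresh_token
```

Calling `generate_refresh_token` with the client ID and secret from step 6 opens a browser window; requesting all three scopes in one flow is what lets the resulting token be shared across the Gmail, Drive, and Calendar repositories.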

Service Setup

Each service README lists the exact secrets, variables, scopes, and output paths for that template.

Common behavior:

  • Missing secrets are treated as a skip where possible, not as a hard failure.
  • Initial backfills can be large. Start with a narrow query or short lookback window, run manually, then widen the scope.
  • Scheduled workflows commit only when files changed.
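
Two of these behaviors can be sketched in a few lines of Python. This is an illustrative sketch under assumptions, not the templates' code: the secret name `SLACK_BOT_TOKEN` is hypothetical, and the real workflows may perform the commit step in the workflow YAML instead.

```python
import subprocess
from typing import Iterable, List, Mapping


def missing_secrets(env: Mapping[str, str], required: Iterable[str]) -> List[str]:
    """Return the names of required settings that are unset or blank."""
    return [name for name in required if not env.get(name, "").strip()]


def should_skip(env: Mapping[str, str], required: Iterable[str]) -> bool:
    """Report a skip (not a failure) when any required secret is missing."""
    missing = missing_secrets(env, required)
    if missing:
        print(f"Skipping sync; missing: {', '.join(missing)}")
        return True
    return False


def has_changes(porcelain_output: str) -> bool:
    """True if `git status --porcelain` listed any modified or untracked path."""
    return bool(porcelain_output.strip())


def commit_if_changed(message: str) -> bool:
    """Stage and commit only when the working tree has changes."""
    status = subprocess.run(
        ["git", "status", "--porcelain"],
        capture_output=True, text=True, check=True,
    )
    if not has_changes(status.stdout):
        return False
    subprocess.run(["git", "add", "-A"], check=True)
    subprocess.run(["git", "commit", "-m", message], check=True)
    return True
```

A sync entry point could call `should_skip(os.environ, ["SLACK_BOT_TOKEN"])` first and exit with status 0 when it returns True, so an unconfigured repository produces a green run with no commit.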

Privacy and Repository Size

Synced email, documents, chat logs, calendars, recordings, lifelogs, and transcripts can contain sensitive data. Use private repositories, restrict token scopes, and be careful before enabling broad chat scopes or large file downloads such as Zoom, Teams, or Limitless audio/video.

Git is excellent for text history, but it is not ideal for large binary archives. Prefer Markdown, JSON, JSONL, .eml, and transcript files for an AI knowledge base.
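
One way to act on this is a pre-commit check that flags files likely to bloat the repository. The sketch below is a hypothetical helper, not part of the templates; the suffix allowlist and the 5 MiB threshold are assumptions to adjust.

```python
from pathlib import Path
from typing import List, Tuple

# Assumption: these suffixes cover the text formats the templates emit.
TEXT_SUFFIXES = {".md", ".json", ".jsonl", ".eml", ".txt", ".vtt"}
MAX_BYTES = 5 * 1024 * 1024  # illustrative threshold, not a project setting


def find_risky_files(root: Path, max_bytes: int = MAX_BYTES) -> List[Tuple[Path, int]]:
    """Return (relative path, size) for files that are non-text or too large."""
    risky = []
    for path in sorted(root.rglob("*")):
        relative = path.relative_to(root)
        # Skip directories and Git's own object store.
        if not path.is_file() or ".git" in relative.parts:
            continue
        size = path.stat().st_size
        if path.suffix.lower() not in TEXT_SUFFIXES or size > max_bytes:
            risky.append((relative, size))
    return risky
```

Running it against a data repository before enabling audio or video downloads gives a quick list of files that might belong in external storage instead.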

Local Development

Install test dependencies and run the unit tests:

python -m pip install -r requirements-dev.txt
python -m pytest

The current tests focus on API-free sync helpers so they can run without SaaS credentials.
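
To show what an API-free test can look like, here is a sketch with a hypothetical helper, `group_by_day`, and a pytest-style test for it. Neither name comes from the repository, and the epoch-seconds `ts` field is an assumption modeled on chat exports.

```python
from collections import defaultdict
from datetime import datetime, timezone
from typing import Dict, Iterable, List


def group_by_day(messages: Iterable[dict]) -> Dict[str, List[dict]]:
    """Bucket messages by UTC date using an epoch-seconds `ts` field."""
    days = defaultdict(list)
    for message in messages:
        moment = datetime.fromtimestamp(float(message["ts"]), tz=timezone.utc)
        days[moment.strftime("%Y-%m-%d")].append(message)
    return dict(days)


def test_group_by_day_splits_on_utc_midnight():
    messages = [
        {"ts": "1704153599", "text": "late"},   # 2024-01-01T23:59:59Z
        {"ts": "1704153600", "text": "early"},  # 2024-01-02T00:00:00Z
    ]
    grouped = group_by_day(messages)
    assert list(grouped) == ["2024-01-01", "2024-01-02"]
    assert grouped["2024-01-02"][0]["text"] == "early"
```

Helpers of this shape take plain dictionaries in and return plain data out, which is what lets the test suite run without any SaaS credentials.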
