A command-line tool to transcribe YouTube videos and automatically repurpose the content into summaries, blog posts, and social media snippets. Perfect for content creators looking to quickly transform their video content for various platforms!
- Video Transcription: Uses OpenAI Whisper to transcribe audio from YouTube videos.
- Audio Standardization: Converts downloaded audio into a standardized WAV format (mono, 16 kHz, 32-bit float) for consistent transcription quality.
- Content Repurposing: Leverages Hugging Face Transformers to generate:
  - A concise summary of the transcript.
  - A detailed blog post based on the transcript.
  - A catchy social media snippet.
- Playlist Support & Parallel Processing: Supports processing entire YouTube playlists and uses a thread pool for faster transcription.
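
The audio standardization step can be pictured as a single ffmpeg call. The sketch below is an assumption about how such a conversion might look, not the code in transcriber.py; the function names `build_standardize_command` and `standardize_audio` are hypothetical. It only relies on standard ffmpeg options: `-ac 1` (mono), `-ar 16000` (16 kHz), and `-c:a pcm_f32le` (32-bit float PCM).

```python
from __future__ import annotations

import subprocess


def build_standardize_command(src: str, dst: str, ffmpeg: str = "ffmpeg") -> list[str]:
    """Build an ffmpeg command that converts src to mono, 16 kHz, 32-bit float WAV.

    Hypothetical helper for illustration; the project may do this differently.
    """
    return [
        ffmpeg,
        "-y",                 # overwrite dst if it already exists
        "-i", src,            # input file (any format ffmpeg can decode)
        "-ac", "1",           # one audio channel (mono)
        "-ar", "16000",       # 16 kHz sample rate
        "-c:a", "pcm_f32le",  # 32-bit float little-endian PCM
        dst,
    ]


def standardize_audio(src: str, dst: str, ffmpeg: str = "ffmpeg") -> None:
    """Run the conversion; raises CalledProcessError if ffmpeg exits non-zero."""
    subprocess.run(build_standardize_command(src, dst, ffmpeg), check=True)
```

Keeping the command construction separate from the subprocess call makes the chosen audio parameters easy to inspect without running ffmpeg.
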
- Clone the repository:

      git clone git@github.com:JUSTSUJAY/scribly.git
      cd scribly
- Create a virtual environment and activate it (optional but recommended):

      python -m venv .venv
      # Windows:
      .venv\Scripts\activate
      # macOS/Linux:
      source .venv/bin/activate
- Install the required packages:

      pip install -r requirements.txt
- Ensure ffmpeg is installed and available in your PATH. If it is not, adjust the get_ffmpeg_path() function in transcriber.py to point to your system's ffmpeg executable. To set the path as an environment variable instead, run:

      export FFMPEG_PATH=/path/to/ffmpeg
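
One way to combine the two lookup strategies above is to prefer the FFMPEG_PATH environment variable and fall back to searching PATH. This is a minimal sketch under that assumption; `resolve_ffmpeg` is a hypothetical name, and the actual get_ffmpeg_path() in transcriber.py may behave differently.

```python
import os
import shutil
from typing import Optional


def resolve_ffmpeg(env_var: str = "FFMPEG_PATH") -> Optional[str]:
    """Return a path to ffmpeg, or None if none can be found.

    Hypothetical helper: checks the environment variable first,
    then falls back to whatever `ffmpeg` is on PATH.
    """
    candidate = os.environ.get(env_var)
    if candidate and os.path.isfile(candidate):
        # The variable points at an existing file, so use it as-is.
        return candidate
    # shutil.which returns None when ffmpeg is not on PATH.
    return shutil.which("ffmpeg")
```

A caller can treat a `None` result as a signal to print installation instructions before any download starts.
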
The project is run from the command line. The main entry point is src/main.py.
To transcribe a YouTube video or playlist:
python src/main.py "https://youtube.com/playlist?list=YOUR_PLAYLIST_ID" --model base
To transcribe and generate a summary, blog post, and social media snippet from each video:
python src/main.py "https://youtube.com/playlist?list=YOUR_PLAYLIST_ID" --model base --repurpose
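
When a playlist expands into many videos, the work can be spread across a thread pool. The sketch below shows the general pattern with Python's standard `concurrent.futures` module; it is an illustration, not the project's code. The names `transcribe_all` and `transcribe_one` are hypothetical, and `transcribe_one` stands in for whatever function downloads and transcribes a single video.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable


def transcribe_all(
    urls: Iterable[str],
    transcribe_one: Callable[[str], str],
    workers: int = 4,
) -> Dict[str, object]:
    """Run transcribe_one on every URL using a pool of worker threads.

    Returns a dict mapping each URL to its transcript, or to the
    exception raised for that URL so one failure does not stop the rest.
    """
    results: Dict[str, object] = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Remember which future belongs to which URL.
        futures = {pool.submit(transcribe_one, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                results[url] = future.result()
            except Exception as exc:  # keep going if one video fails
                results[url] = exc
    return results
```

Collecting exceptions per URL, rather than letting the first one propagate, means a single private or deleted video does not abort the rest of a playlist.
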
- links: One or more YouTube video or playlist URLs.
- --model: Specify the Whisper model size. Options: tiny, base, small, medium, large (default: base).
- --workers: Number of parallel threads for processing videos (default: 4).
- --output: Directory where transcripts and repurposed content will be saved (default: transcripts).
- --repurpose: Flag to generate a summary, blog post, and social snippet from each transcript.
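
The options listed above map naturally onto Python's `argparse`. The following sketch reconstructs a parser from the documented names and defaults only; it is an assumption about the interface, not a copy of src/main.py, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(
        description="Transcribe YouTube videos and optionally repurpose the text."
    )
    # One or more video or playlist URLs.
    parser.add_argument("links", nargs="+", help="YouTube video or playlist URLs")
    parser.add_argument(
        "--model",
        choices=["tiny", "base", "small", "medium", "large"],
        default="base",
        help="Whisper model size",
    )
    parser.add_argument("--workers", type=int, default=4, help="parallel threads")
    parser.add_argument("--output", default="transcripts", help="output directory")
    # A plain flag: False unless --repurpose is given.
    parser.add_argument("--repurpose", action="store_true", help="generate extras")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

Using `choices` for `--model` makes argparse reject an unsupported model size before any download or transcription begins.
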
scribly/
├── src/
│ ├── main.py
│ ├── transcriber.py
│ ├── repurposer.py
│ └── utils.py
├── requirements.txt
└── README.md
Contributions are welcome! Feel free to fork the repository and submit pull requests. When contributing, please follow the existing code style and include tests when possible.
This project is open-source and available under the MIT License.
- OpenAI Whisper for robust speech recognition.
- Hugging Face Transformers for state-of-the-art text generation.
- yt-dlp for downloading audio from YouTube.
- Thanks to the community for all the support and contributions!