Ian-Qu/Duplication_Code_Removal_Tool

repo-dedupe-tool

A small-model-friendly Python CLI that scans a repository, finds duplicate or near-duplicate code/logic, and prepares safe replacement plans.

What it does

  • Walks a repo while honoring .gitignore-style excludes.
  • Extracts Python functions and methods with ast.
  • Normalizes code to improve duplicate detection.
  • Finds exact duplicates by hashing normalized bodies.
  • Finds near-duplicates with embeddings and FAISS.
  • Produces a JSON plan and Markdown report for human review.
  • Emits prompt bundles so a small local LLM can propose replacements with tight context windows.

Install

python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Run

python -m repo_dedupe_tool scan /path/to/repo --out findings
python -m repo_dedupe_tool prompts findings/plan.json --out findings/prompts

Output

  • report.md: readable summary
  • plan.json: machine-readable candidate actions
  • prompts/*.md: one prompt per duplicate cluster

Export

Use the export command to create a zip archive of the project directory.

python -m repo_dedupe_tool export output/repo_dedupe_tool --zip-path repo_dedupe_tool.zip
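For readers curious what such an export amounts to, the following is a minimal sketch of zipping a directory with the standard library. The function name `zip_directory` is hypothetical, and nothing here is confirmed to match how the `export` command works internally.

```python
import zipfile
from pathlib import Path


def zip_directory(src_dir, zip_path):
    """Write every file under src_dir into zip_path; return the number of files added."""
    src = Path(src_dir).resolve()
    target = Path(zip_path).resolve()
    count = 0
    with zipfile.ZipFile(target, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(src.rglob("*")):
            # Skip directories, and skip the archive itself if it lives inside src.
            if not path.is_file() or path == target:
                continue
            # Store paths relative to src so the archive unpacks cleanly.
            archive.write(path, arcname=path.relative_to(src).as_posix())
            count += 1
    return count
```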

Patch generation after user input

The tool is designed to stop at the plan stage first. After you review the plan, you can run a patch-generation step that converts one approved cluster into a unified diff, which is applied only after explicit confirmation.

Patch generation

  1. Run scan to produce plan.json.
  2. Review a cluster and prepare a replacement snippet file that contains the approved function body.
  3. Generate a patch:
python -m repo_dedupe_tool make-patch findings/plan.json /path/to/repo \
  --cluster-id cluster-001 \
  --member-key 'src/app.py:10-35:old_func' \
  --replacement-file approved_replacement.py \
  --out findings/patches
  4. Apply it only after confirmation:
python -m repo_dedupe_tool apply-patch findings/patches/cluster-001_old_func.patch /path/to/repo
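
The "unified diff, then explicit confirmation" flow described above can be sketched with `difflib` from the standard library. This is an illustration under assumptions, not the code behind `make-patch` or `apply-patch`: the helper names `make_unified_diff` and `confirmed` are hypothetical.

```python
import difflib


def make_unified_diff(original, replacement, rel_path):
    """Return a unified diff (as one string) that turns `original` into `replacement`."""
    diff = difflib.unified_diff(
        original.splitlines(keepends=True),
        replacement.splitlines(keepends=True),
        fromfile=f"a/{rel_path}",
        tofile=f"b/{rel_path}",
    )
    return "".join(diff)


def confirmed(answer):
    """Treat only an explicit 'y' or 'yes' as approval; anything else is a refusal."""
    return answer.strip().lower() in {"y", "yes"}
```

A caller could print the result of `make_unified_diff(...)`, pass `input("Apply? [y/N] ")` to `confirmed`, and write the file only when it returns `True`. Defaulting to refusal keeps an accidental Enter keypress from modifying the repository.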
