SVN → Git Migration Plan: BRANCH_MAINT_4_04
Overview
Source: http://build:8080/svn/Software/Views/DTS.Suite/branches/BRANCH_MAINT_4_04
Target: New Git repository
Scope: Single project branch, full history preservation
Estimated Time: 1-2 hours
SVN Repository Structure
http://build:8080/svn/Software/
└── Views/
    └── DTS.Suite/
        └── branches/
            └── BRANCH_MAINT_4_04/   ← This working copy
                ├── Common/
                ├── DataPRO/
                ├── DataPRO_sql/
                ├── DTS Viewer/
                └── ...
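Note: The SVN layout is non-standard (Views/... instead of trunk/, branches/, tags/ at the root), so the migration clones the branch URL directly rather than relying on standard-layout detection.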
Prerequisites
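# Install tools on macOS
brew install git svn
# If "git svn" is not available afterwards, install it separately
# (recent Homebrew versions ship it as the git-svn formula)
# Verify network access (-c 3 stops after three packets; macOS ping otherwise runs until interrupted)
ping -c 3 build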
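A short preflight can catch a missing tool before the long clone starts. The sketch below assumes a POSIX shell; require_tools is a hypothetical helper name, not part of git or svn.

```shell
# Hypothetical helper: report every listed command that is not on PATH.
# Returns 0 when all are present, 1 otherwise.
require_tools() {
  _missing=0
  for _tool in "$@"; do
    if ! command -v "$_tool" >/dev/null 2>&1; then
      echo "missing: $_tool" >&2
      _missing=1
    fi
  done
  return "$_missing"
}

# Example preflight (commented out so the snippet has no side effects):
# require_tools git svn && git svn --version
# svn info http://build:8080/svn/Software/Views/DTS.Suite/branches/BRANCH_MAINT_4_04
```

ping only shows that the host answers; svn info additionally confirms that the HTTP endpoint on port 8080 and the credentials work.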
Phase 1: Author Mapping (~10 min)
1.1 Extract SVN Authors
svn log http://build:8080/svn/Software/Views/DTS.Suite/branches/BRANCH_MAINT_4_04 --quiet | grep "^r" | awk '{print $3}' | sort -u > svn-authors.txt
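The awk '{print $3}' step splits on whitespace, so a username containing a space would be cut short. A variant that splits on the " | " separator instead, as a sketch; extract_svn_authors is a hypothetical name.

```shell
# Hypothetical helper: read `svn log --quiet` output on stdin and print
# the unique author names, one per line.
extract_svn_authors() {
  # Revision lines look like: r123 | jdoe | 2019-03-01 10:00:00 +0000 (...)
  # Separator lines (dashes) do not match the pattern and are skipped.
  awk -F' [|] ' '/^r[0-9]+ [|] / { print $2 }' | sort -u
}

# Usage (commented out; needs access to the SVN server):
# svn log --quiet http://build:8080/svn/Software/Views/DTS.Suite/branches/BRANCH_MAINT_4_04 \
#   | extract_svn_authors > svn-authors.txt
```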
1.2 Create authors.txt
Transform the extracted list into Git format:
svn_username = Full Name <email@example.com>
jdoe = John Doe <john@company.com>
tsmith = Tom Smith <tom@company.com>
Format: svn_username = Git Name <email>
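Writing the mapping by hand is error-prone, and git svn stops when it meets an author that the authors file does not define. The sketch below generates a skeleton to edit and lists any usernames still unmapped. Both function names are hypothetical, and the generated names and emails are placeholders that must be replaced with real identities.

```shell
# Hypothetical helper: emit one "user = user <user@DOMAIN>" line per
# username read from stdin. DOMAIN defaults to example.com (placeholder).
make_authors_skeleton() {
  awk -v domain="${1:-example.com}" \
    'NF { printf "%s = %s <%s@%s>\n", $1, $1, $1, domain }'
}

# Hypothetical helper: print usernames from $1 (svn-authors.txt) that have
# no "username = ..." entry in $2 (authors.txt).
missing_authors() {
  awk '
    FILENAME == ARGV[1] { sub(/[ \t]*=.*$/, ""); mapped[$0] = 1; next }
    NF && !($0 in mapped) { print }
  ' "$2" "$1"
}

# Usage (commented out):
# make_authors_skeleton company.com < svn-authors.txt > authors.txt
# missing_authors svn-authors.txt authors.txt   # no output means complete
```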
Phase 2: Binary Exclusion Strategy
2.1 Patterns to Exclude
Based on repository analysis (366 binary files identified):
| Category | Pattern | Reason |
|---|---|---|
| Build outputs | **/bin/, **/obj/ | Generated by build |
| DLLs | *.dll | NuGet restore or build output |
| Executables | *.exe | Build output or redistributable |
| Installers | *.msi | Build artifact |
| Packages | *.nupkg, **/packages/ | NuGet restore |
| Debug files | *.pdb | Build output |
| Database files | *.mdf, *.ldf | Development data |
| PDFs (large) | *.pdf | Documentation, not source |
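The binary inventory behind this table can be re-run at any time. A minimal sketch, assuming a list of file paths on stdin; count_by_extension is a hypothetical name and the extension list simply mirrors the table above.

```shell
# Hypothetical helper: read file paths on stdin and print "ext count"
# for each binary extension from the table, sorted by extension.
count_by_extension() {
  grep -iE '\.(dll|exe|msi|nupkg|pdb|mdf|ldf|pdf)$' |
    awk -F. '{ print tolower($NF) }' |
    sort | uniq -c | awk '{ print $2, $1 }'
}

# Usage from the SVN working copy (commented out):
# find . -type f ! -path '*/.svn/*' | count_by_extension
```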
2.2 .gitignore Template
# Build outputs
**/bin/
**/obj/
*.dll
*.exe
*.pdb
# Installers & redistributables
**/DataPRO Installer/**/*.msi
**/DataPRO Installer/**/*.exe
**/Redistributables/
# Database files
*.mdf
*.ldf
# Packages (restore via NuGet)
**/packages/
*.nupkg
# IDE & OS files
.vs/
.idea/
*.user
*.suo
.DS_Store
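# Generated files
# Caution: *.Designer.cs files (WinForms, resources, settings) are usually committed
# because projects compile them; keep this pattern only if the build regenerates them.
*.Designer.cs
*.g.cs
*.g.i.cs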
# AI enrichment (optional)
enriched/
enriched-qwen3-coder-next/
.vectordb/
2.3 Third-Party DLLs Decision
Question: Are any third-party DLLs required in source control (not available via NuGet)?
If yes, add a negation rule for each one after the *.dll pattern (in .gitignore the last matching rule wins):
!Common/DTS.CommonCore/lib/ThirdParty/required.dll
Phase 3: Migration (~30-90 min)
3.1 Clone SVN to Git
git svn clone \
--authors-file=authors.txt \
--no-metadata \
--prefix=svn/ \
http://build:8080/svn/Software/Views/DTS.Suite/branches/BRANCH_MAINT_4_04 \
BRANCH_MAINT_4_04-git
Flags:
- --authors-file: Maps SVN users → Git identities
- --no-metadata: Cleaner commits (no git-svn-id lines)
- --prefix=svn/: Remote-tracking branch naming
3.2 Alternative: git svn init + fetch (for better control)
If the clone is interrupted or you need more control:
mkdir BRANCH_MAINT_4_04-git
cd BRANCH_MAINT_4_04-git
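git svn init --no-metadata --prefix=svn/ \
http://build:8080/svn/Software/Views/DTS.Suite/branches/BRANCH_MAINT_4_04
# --authors-file is a fetch option, so it is passed here; re-run this command to resume after an interruption
git svn fetch --authors-file=../authors.txt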
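Because git svn fetch continues from the last revision it imported, an interrupted fetch can simply be re-run. A small retry wrapper, as a sketch; retry is a hypothetical helper, and the attempt count and delay in the usage line are arbitrary.

```shell
# Hypothetical helper: retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
retry() {
  _max=$1
  _delay=$2
  shift 2
  _attempt=1
  until "$@"; do
    if [ "$_attempt" -ge "$_max" ]; then
      echo "giving up after $_attempt attempts: $*" >&2
      return 1
    fi
    _attempt=$((_attempt + 1))
    sleep "$_delay"
  done
}

# Usage inside BRANCH_MAINT_4_04-git (commented out):
# retry 5 30 git svn fetch
```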
Phase 4: Post-Clone Cleanup (~15 min)
4.1 Navigate to New Repo
cd BRANCH_MAINT_4_04-git
4.2 Add .gitignore
# Create .gitignore with content from Phase 2.2
4.3 Remove Binaries from Git Tracking
# Remove build outputs from the index (files stay on disk).
# --ignore-unmatch keeps the command from aborting when a pattern matches nothing.
git rm -r --cached --ignore-unmatch -- '*/bin/*' '*/obj/*' '*/packages/*'
# Remove binaries
git rm --cached --ignore-unmatch -- '*.dll' '*.exe' '*.msi'
git rm --cached --ignore-unmatch -- '*.pdb' '*.mdf' '*.ldf' '*.nupkg'
# Commit cleanup
git add .gitignore
git commit -m "Add .gitignore, remove binaries from tracking"
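An alternative to listing patterns by hand is to let .gitignore drive the removal, which also honours any negation rules from 2.3. A minimal sketch, assuming .gitignore is already in place; untrack_ignored is a hypothetical name.

```shell
# Hypothetical helper: untrack every file that is tracked but matches the
# current ignore rules. Working-tree copies are left on disk.
untrack_ignored() {
  # Nothing to do when no tracked file matches the ignore rules.
  if [ -z "$(git ls-files --cached --ignored --exclude-standard)" ]; then
    return 0
  fi
  # NUL-separated output keeps paths with spaces (e.g. "DTS Viewer") intact.
  git ls-files -z --cached --ignored --exclude-standard |
    xargs -0 git rm --cached --quiet --
}

# Usage inside the new repository (commented out):
# untrack_ignored && git status --short
```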
4.4 Clean Repository Size (Optional)
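# Remove large files from entire history (destructive: rewrites every commit hash).
# Git's own documentation recommends git filter-repo over filter-branch;
# the command below is kept for environments where filter-repo is not installed.
git filter-branch --force --index-filter \
'git rm --cached --ignore-unmatch -- "*.dll" "*.exe" "*.msi" "*.mdf"' \
--prune-empty --tag-name-filter cat -- --all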
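git filter-branch keeps the pre-rewrite refs under refs/original/, and objects reachable from them survive garbage collection. The sketch below follows the shrinking checklist in the git filter-branch documentation and removes those backup refs; drop_filter_branch_backups is a hypothetical wrapper name. Run it only after the rewritten history has been checked, since it discards the backup.

```shell
# Hypothetical helper: delete the backup refs that git filter-branch
# leaves under refs/original/ so the old objects become unreachable.
drop_filter_branch_backups() {
  git for-each-ref --format='delete %(refname)' refs/original |
    git update-ref --stdin
}

# Usage inside the rewritten repository (commented out):
# drop_filter_branch_backups
# git count-objects -vH   # compare size before and after garbage collection
```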
Then garbage collect:
git reflog expire --expire=now --all
git gc --prune=now --aggressive
Phase 5: Push to Remote (~5 min)
5.1 Create Remote Repository
Options:
- GitHub: gh repo create datapro --private
- GitLab: Create via UI
- Self-hosted: git init --bare on server
5.2 Push
git remote add origin git@github.com:your-org/datapro.git
git branch -M main
git push -u origin main
Phase 6: Verification
# Check history
git log --oneline | head -20
# Check file count
git ls-files | wc -l
# Check for missed binaries (no output means none are tracked)
git ls-files | grep -E '\.(dll|exe|msi|pdb|mdf|ldf)$'
# Verify author mapping
git log --format='%an <%ae>' | sort -u
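The binary check can be turned into a pass/fail gate for a script or CI job. A minimal sketch; assert_no_binaries is a hypothetical name and its extension list is an assumption based on the Phase 2 patterns.

```shell
# Hypothetical helper: read tracked paths on stdin, print any binary
# offenders to stderr, and return 1 if at least one was found.
assert_no_binaries() {
  if grep -iE '\.(dll|exe|msi|pdb|mdf|ldf|nupkg)$' >&2; then
    return 1
  fi
  return 0
}

# Usage (commented out):
# git ls-files | assert_no_binaries || echo "binaries still tracked"
#
# Rough history check: SVN revisions on the branch vs. Git commits.
# svn log --quiet http://build:8080/svn/Software/Views/DTS.Suite/branches/BRANCH_MAINT_4_04 | grep -c '^r[0-9]'
# git rev-list --count HEAD
```

The two counts should be close but need not match exactly; the optional history rewrite in 4.4 drops commits that become empty. Any third-party DLLs kept on purpose under 2.3 will be reported and need to be filtered out of the input.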
Timeline Summary
| Step | Time | Risk |
|---|---|---|
| Install tools | 5 min | Low |
| Extract authors & create mapping | 10 min | Low |
| git svn clone | 30-90 min | Medium (network) |
| Cleanup & .gitignore | 15 min | Low |
| Push to remote | 5 min | Low |
Total: ~1-2 hours
Open Questions
- Migration machine: Run on Mac or machine on same network as SVN server?
- Git hosting: GitHub, GitLab, or self-hosted?
- Third-party DLLs: Any that must stay in source control?
- Private files: Any secrets/configs to exclude before push?
Rollback Plan
If migration fails:
- Original SVN working copy is unaffected
- Delete BRANCH_MAINT_4_04-git/ and retry
- SVN server remains authoritative until Git push succeeds
Post-Migration
- Update CI/CD pipelines to use Git
- Notify team of new repository location
- Set SVN branch to read-only (optional)
- Document new workflow in team wiki