# SVN → Git Migration Plan: BRANCH_MAINT_4_04

## Overview

- **Source:** `http://build:8080/svn/Software/Views/DTS.Suite/branches/BRANCH_MAINT_4_04`
- **Target:** New Git repository
- **Scope:** Single project branch, full history preservation
- **Estimated Time:** 1-2 hours

---

## SVN Repository Structure

```
http://build:8080/svn/Software/
└── Views/
    └── DTS.Suite/
        └── branches/
            └── BRANCH_MAINT_4_04/   ← This working copy
                ├── Common/
                ├── DataPRO/
                ├── DataPRO_sql/
                ├── DTS Viewer/
                └── ...
```

Note: Non-standard SVN layout (`Views/...` instead of `trunk`/`branches` at the root), so the branch URL is cloned directly rather than with `--stdlayout`.

---

## Prerequisites

```bash
# Install tools on macOS (Homebrew ships git-svn as a separate formula)
brew install git subversion git-svn

# Verify network access and that the SVN URL is reachable
ping -c 3 build
svn info http://build:8080/svn/Software/Views/DTS.Suite/branches/BRANCH_MAINT_4_04
```

---

## Phase 1: Author Mapping (~10 min)

### 1.1 Extract SVN Authors

```bash
# Take the author field (second "|"-separated column) of every revision line
svn log --quiet http://build:8080/svn/Software/Views/DTS.Suite/branches/BRANCH_MAINT_4_04 \
  | awk -F '|' '/^r/ { gsub(/^ +| +$/, "", $2); print $2 }' \
  | sort -u > svn-authors.txt
```

### 1.2 Create authors.txt

Transform the extracted list into the format `git svn` expects, one line per SVN username:

```
svn_username = Full Name <email@example.com>
jdoe = John Doe <jdoe@example.com>
tsmith = Tom Smith <tsmith@example.com>
```

**Format:** `svn_username = Git Name <email>`

The addresses above are placeholders. Every SVN author needs an entry; `git svn` stops at the first revision whose author is missing from the file.

---

## Phase 2: Binary Exclusion Strategy

### 2.1 Patterns to Exclude

Based on repository analysis (366 binary files identified):

| Category | Pattern | Reason |
|----------|---------|--------|
| Build outputs | `**/bin/`, `**/obj/` | Generated by build |
| DLLs | `*.dll` | NuGet restore or build output |
| Executables | `*.exe` | Build output or redistributable |
| Installers | `*.msi` | Build artifact |
| Packages | `*.nupkg`, `**/packages/` | NuGet restore |
| Debug files | `*.pdb` | Build output |
| Database files | `*.mdf`, `*.ldf` | Development data |
| PDFs (large) | `*.pdf` | Documentation, not source |

### 2.2 .gitignore Template

```gitignore
# Build outputs
**/bin/
**/obj/
*.dll
*.exe
*.pdb

# Installers & redistributables
**/DataPRO Installer/**/*.msi
**/DataPRO Installer/**/*.exe
**/Redistributables/

# Database files
*.mdf
*.ldf

# Packages (restore via NuGet)
**/packages/
*.nupkg

# IDE & OS files
.vs/
.idea/
*.user
*.suo
.DS_Store

# Generated files (review first: WinForms/resx *.Designer.cs files are normally committed)
*.Designer.cs
*.g.cs
*.g.i.cs

# AI enrichment (optional)
enriched/
enriched-qwen3-coder-next/
.vectordb/
```

### 2.3 Third-Party DLLs Decision

**Question:** Are any third-party DLLs required in source control (not available via NuGet)?

If yes, track them as exceptions. Negation rules must come after the `*.dll` pattern in `.gitignore`:

```
!Common/DTS.CommonCore/lib/ThirdParty/required.dll
```

---

## Phase 3: Migration (~30-90 min)

### 3.1 Clone SVN to Git

```bash
git svn clone \
  --authors-file=authors.txt \
  --no-metadata \
  --prefix=svn/ \
  http://build:8080/svn/Software/Views/DTS.Suite/branches/BRANCH_MAINT_4_04 \
  BRANCH_MAINT_4_04-git
```

**Flags:**

- `--authors-file`: Maps SVN users → Git identities
- `--no-metadata`: Cleaner commits (no `git-svn-id` lines); intended for one-shot imports like this one
- `--prefix=svn/`: Remote-tracking branch naming

### 3.2 Alternative: git svn init + fetch (for better control)

If you need more control, split the clone into `init` and `fetch`. Note that `--authors-file` is an option of `fetch` (and `clone`), not of `init`. To resume an interrupted clone, `cd` into the existing directory and run only the `git svn fetch` line:

```bash
mkdir BRANCH_MAINT_4_04-git
cd BRANCH_MAINT_4_04-git
git svn init --no-metadata --prefix=svn/ \
  http://build:8080/svn/Software/Views/DTS.Suite/branches/BRANCH_MAINT_4_04
git svn fetch --authors-file=../authors.txt
```

---

## Phase 4: Post-Clone Cleanup (~15 min)

### 4.1 Navigate to New Repo

```bash
cd BRANCH_MAINT_4_04-git
```

### 4.2 Add .gitignore

```bash
# Create .gitignore with content from Phase 2.2
```

### 4.3 Remove Binaries from Git Tracking

```bash
# Remove build outputs from the index (files stay on disk)
git rm -r --cached --ignore-unmatch ':(glob)**/bin/**' ':(glob)**/obj/**' ':(glob)**/packages/**'

# Remove binaries and database files from the index
git rm --cached --ignore-unmatch '*.dll' '*.exe' '*.msi' '*.pdb' '*.nupkg' '*.mdf' '*.ldf'

# Commit cleanup
git add .gitignore
git commit -m "Add .gitignore, remove binaries from tracking"
```

### 4.4 Clean Repository Size (Optional)

This step rewrites every commit, so run it before the first push. `git filter-branch` works for this, although Git's own documentation now recommends `git filter-repo` for history rewriting.

```bash
# Remove large files from entire history (destructive)
git filter-branch --force --index-filter \
  'git rm --cached --ignore-unmatch "*.dll" "*.exe" "*.msi" "*.mdf"' \
  --prune-empty --tag-name-filter cat -- --all
```

Then garbage collect:

```bash
# Drop the refs/original backups left by filter-branch, then prune
git for-each-ref --format='%(refname)' refs/original/ | xargs -n 1 git update-ref -d
git reflog expire --expire=now --all
git gc --prune=now --aggressive
```

---

## Phase 5: Push to Remote (~5 min)

### 5.1 Create Remote Repository

Options:

- **GitHub:** `gh repo create datapro --private`
- **GitLab:** Create via UI
- **Self-hosted:** `git init --bare` on server

### 5.2 Push

```bash
git remote add origin git@github.com:your-org/datapro.git
git branch -M main
git push -u origin main
```

---

## Phase 6: Verification

```bash
# Check history
git log --oneline | head -20

# Check file count
git ls-files | wc -l

# Check for missed binaries (no output means none are tracked)
git ls-files | grep -E '\.(dll|exe|msi|pdb|mdf|ldf)$'

# Verify author mapping
git log --format='%an <%ae>' | sort -u
```

---

## Timeline Summary

| Step | Time | Risk |
|------|------|------|
| Install tools | 5 min | Low |
| Extract authors & create mapping | 10 min | Low |
| git svn clone | 30-90 min | Medium (network) |
| Cleanup & .gitignore | 15 min | Low |
| Push to remote | 5 min | Low |

**Total:** ~1-2 hours

---

## Open Questions

1. **Migration machine:** Run on the Mac or on a machine on the same network as the SVN server?
2. **Git hosting:** GitHub, GitLab, or self-hosted?
3. **Third-party DLLs:** Any that must stay in source control?
4. **Private files:** Any secrets/configs to exclude before push?

---

## Rollback Plan

If migration fails:

1. The original SVN working copy is unaffected
2. Delete `BRANCH_MAINT_4_04-git/` and retry
3. The SVN server remains authoritative until the Git push succeeds

---

## Post-Migration

- [ ] Update CI/CD pipelines to use Git
- [ ] Notify team of new repository location
- [ ] Set SVN branch to read-only (optional)
- [ ] Document new workflow in team wiki
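
---

## Appendix: Drafting authors.txt

The transformation described in Phase 1.2 can be started from a script. The following is a minimal sketch, not part of the original plan: the function name `make_authors_map`, the `EMAIL_DOMAIN` variable, and the output file `authors.draft.txt` are all hypothetical, and the generated names and addresses are placeholders that still need to be corrected by hand before the file is used as `authors.txt`.

```bash
#!/bin/sh
# Hypothetical helper: read one SVN username per line on stdin and print
# "username = username <username@DOMAIN>" lines in git svn's authors-file format.
# EMAIL_DOMAIN is an assumed placeholder; real names and addresses must be filled in manually.
EMAIL_DOMAIN="${EMAIL_DOMAIN:-example.com}"

make_authors_map() {
    # NF skips blank lines; $0 is the full username as extracted in Phase 1.1.
    awk -v domain="$EMAIL_DOMAIN" 'NF { printf "%s = %s <%s@%s>\n", $0, $0, $0, domain }'
}

# Write a draft only when the extracted list exists; authors.txt itself is never overwritten.
if [ -f svn-authors.txt ]; then
    make_authors_map < svn-authors.txt > authors.draft.txt
fi
```

After editing the draft, copy it to `authors.txt`. Writing to a separate draft file is a deliberate choice so that re-running the script cannot clobber a mapping that has already been hand-edited.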