Enable rollback on install failures during update #40524

Open
yeelam-gordon wants to merge 4 commits into master from fix/system-vhd-rollback-and-checks

@yeelam-gordon
Contributor

@yeelam-gordon yeelam-gordon commented May 13, 2026

Fix: all installed files lost on failed MSI upgrade

Fixes #40488

Problem

MajorUpgrade with no Schedule attribute (our previous state) uses the WiX default afterInstallValidate, which removes the old product outside the MSI transaction. If the new install fails afterward, all old files (~30 files, ~1.1GB including system.vhd) are gone with no rollback.

Even worse: once files are lost, running the installer again does not recover them. Running msiexec /i again reports success but does nothing -- MSI thinks the product is already installed and skips all file operations. Recovery requires msiexec /fa (repair), REINSTALL=ALL, or a full uninstall + reinstall -- none of which a typical user would know to try.

Fix

Add Schedule="afterInstallInitialize" -- this moves old product removal inside the transaction. On failure, MSI restores all files from .rbf backups automatically. The unrecoverable state is never reached.
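
For illustration, the scheduling change amounts to a single attribute on the WiX MajorUpgrade element. This is a minimal sketch, not the actual contents of msipackage/package.wix.in: the DowngradeErrorMessage text is a placeholder, and the real element may carry other attributes.

```xml
<!-- Illustrative fragment only; the real element lives in msipackage/package.wix.in. -->
<!-- The DowngradeErrorMessage text below is a placeholder, not the project's string. -->
<MajorUpgrade
    Schedule="afterInstallInitialize"
    DowngradeErrorMessage="A newer version is already installed." />
```

With this value, RemoveExistingProducts is sequenced after InstallInitialize, so removal of the old product is part of the same transaction as the new install.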

Tradeoff: the upgrade needs ~700 MB of extra temporary disk space (freed on commit). No other downsides identified -- all custom actions are guarded by (not UPGRADINGPRODUCTCODE).

Not in scope

Locked-file reboot-pending deletes (kernel-mode lock holders that MSIRMSHUTDOWN can't kill). That's a separate issue.

Test results

| Scenario | Without fix (default) | With fix (afterInstallInitialize) |
| --- | --- | --- |
| Upgrade fails mid-install | ❌ WSL completely broken -- all files gone (system.vhd, wsl.exe, wslservice.exe, ...) | ✅ All files automatically restored, WSL still works |
| User runs installer again to fix it | ❌ Installer reports success but does nothing -- files still missing | ✅ Not needed -- files were never lost |

How the fix works (observed via filesystem monitoring):

During upgrade, old files are atomically renamed to .rbf backups (not deleted). On failure, MSI restores them from the .rbf files automatically.
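
The rename-then-restore behavior described above can be modeled outside Windows Installer. The following is a minimal C++17 sketch under stated assumptions, not MSI's implementation: BackupTransaction, UpgradeFile, and ReadAll are hypothetical names, and a real installer also has to track registry, service, and other state that this model ignores.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <iterator>
#include <string>
#include <system_error>
#include <utility>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical model of rollback-capable file removal (not Windows Installer code).
class BackupTransaction
{
public:
    // "Remove" a file by renaming it to <name>.rbf in the same directory.
    void Remove(const fs::path& file)
    {
        fs::path backup = file;
        backup += ".rbf";
        fs::rename(file, backup);
        m_backups.emplace_back(file, backup);
    }

    // Success: the backups are no longer needed, so delete them.
    void Commit()
    {
        for (const auto& entry : m_backups)
        {
            fs::remove(entry.second);
        }
        m_backups.clear();
    }

    // Failure: discard any new file and rename each backup to its original name.
    void Rollback() noexcept
    {
        for (auto it = m_backups.rbegin(); it != m_backups.rend(); ++it)
        {
            std::error_code ec;
            fs::remove(it->first, ec);
            fs::rename(it->second, it->first, ec);
        }
        m_backups.clear();
    }

    // A transaction that was never committed rolls back on destruction.
    ~BackupTransaction() { Rollback(); }

private:
    std::vector<std::pair<fs::path, fs::path>> m_backups; // (original, backup)
};

// Replaces one file with new content; returns false if the simulated install failed.
bool UpgradeFile(const fs::path& file, const std::string& newContent, bool failAfterCopy)
{
    assert(fs::exists(file)); // the old version must be installed
    BackupTransaction txn;
    txn.Remove(file);
    {
        std::ofstream out(file, std::ios::binary);
        out << newContent;
    }
    if (failAfterCopy)
    {
        return false; // txn's destructor restores the old file
    }
    txn.Commit();
    return true;
}

// Reads a whole file into a string.
std::string ReadAll(const fs::path& file)
{
    std::ifstream in(file, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in), std::istreambuf_iterator<char>());
}
```

Because the backup is a same-directory rename rather than a copy, the old content stays on disk until commit, which is consistent with the temporary disk usage noted under Tradeoff.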

Copilot AI review requested due to automatic review settings May 13, 2026 14:54
Contributor

Copilot AI left a comment

Pull request overview

This PR hardens WSL against unrecoverable failures when an MSI major upgrade fails mid-install by (1) moving old-product removal into the MSI transaction for rollback safety and (2) adding a dedicated, localized error path when required packaged VHD files (e.g., system.vhd, modules.vhd) are missing so users get an actionable message instead of a generic HCS failure.

Changes:

  • Adjust WiX MajorUpgrade scheduling so RemoveExistingProducts runs inside the MSI transaction (rollback restores the previous install on failure).
  • Introduce WSL_E_SYSTEM_DISTRO_MISSING and a localized message for missing packaged files.
  • Add runtime existence checks in VM startup paths and wire the new HRESULT into common error-string handling.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| msipackage/package.wix.in | Schedules MajorUpgrade removal to occur inside the MSI transaction for rollback protection. |
| src/windows/service/inc/wslservice.idl | Adds new HRESULT WSL_E_SYSTEM_DISTRO_MISSING (0x33). |
| src/windows/service/exe/WslCoreVm.cpp | Replaces debug-only asserts with production checks that throw a user-facing localized error when packaged VHDs are missing. |
| src/windows/service/exe/HcsVirtualMachine.cpp | Adds packaged-file existence validation before attaching boot VHDs (currently without setting a user-facing message). |
| src/windows/common/wslutil.cpp | Adds the new HRESULT to common error mappings and returns a localized fallback string (currently hard-coded to system.vhd). |
| localization/strings/en-US/Resources.resw | Adds MessageSystemDistroMissing localized string resource. |

Comment thread src/windows/service/exe/HcsVirtualMachine.cpp Outdated
Comment thread src/windows/common/wslutil.cpp Outdated
Comment thread src/windows/service/exe/WslCoreVm.cpp Outdated
@benhillis
Member

Thanks for investigating. Would it be possible to root-cause the issue instead of applying a band-aid? A slightly better error message isn’t going to help users who get into this state.

@yeelam-gordon yeelam-gordon force-pushed the fix/system-vhd-rollback-and-checks branch 2 times, most recently from 2ad3626 to 05c4925 Compare May 14, 2026 03:37
@benhillis benhillis added msix Installer issue. file system labels May 17, 2026
Move MajorUpgrade Schedule to afterInstallInitialize so RemoveExistingProducts
runs inside the MSI transaction. On upgrade failure, the old product is restored
instead of leaving files permanently deleted.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 18, 2026 03:35
@yeelam-gordon yeelam-gordon force-pushed the fix/system-vhd-rollback-and-checks branch from 05c4925 to c1f0d2c Compare May 18, 2026 03:35
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated no new comments.

@yeelam-gordon
Contributor Author

Apologies for the earlier noise — I've cleaned this up. The PR is now a single-line MSI fix (the root cause), no runtime checks. The previous defense-in-depth changes have been removed to keep this focused on what actually prevents the data loss.

@yeelam-gordon yeelam-gordon marked this pull request as ready for review May 19, 2026 01:11
@yeelam-gordon yeelam-gordon requested a review from a team as a code owner May 19, 2026 01:11
Copilot AI review requested due to automatic review settings May 19, 2026 01:11
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated no new comments.

ptrivedi previously approved these changes May 21, 2026
Contributor

@ptrivedi ptrivedi left a comment

Seems to be a righteous fix targeted at preventing data loss. It would be good to keep an eye out for why some of these installs fail as we look at more issues.

OneBlue previously approved these changes Jun 2, 2026
Collaborator

@OneBlue OneBlue left a comment

Reading through https://docs.firegiant.com/wix3/xsd/wix/majorupgrade/, this appears to be low risk and should be a direct improvement in case of an upgrade failure.

Change LGTM, although if at all possible I would recommend adding a test that artificially fails the upgrade to validate the rollback (potentially doing something similar to what the MsixUpgradeFails() test case does).

@chemwolf6922 chemwolf6922 changed the title from "Fix system.vhd loss during failed MSI upgrade" to "Enable rollback on install failures during update" Jun 5, 2026
Copilot AI review requested due to automatic review settings June 5, 2026 09:33
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

Comment thread test/windows/InstallerTests.cpp Outdated
Comment thread test/windows/InstallerTests.cpp Outdated
Comment thread test/windows/InstallerTests.cpp Outdated
@yeelam-gordon yeelam-gordon force-pushed the fix/system-vhd-rollback-and-checks branch from 25ff710 to 09ceb15 Compare June 5, 2026 10:07
Copilot AI review requested due to automatic review settings June 5, 2026 10:17
@yeelam-gordon yeelam-gordon force-pushed the fix/system-vhd-rollback-and-checks branch from 09ceb15 to 6ec96bb Compare June 5, 2026 10:17
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.

@yeelam-gordon yeelam-gordon force-pushed the fix/system-vhd-rollback-and-checks branch from 6ec96bb to 3ae9004 Compare June 5, 2026 11:16
Copilot AI review requested due to automatic review settings June 5, 2026 14:36
@yeelam-gordon yeelam-gordon force-pushed the fix/system-vhd-rollback-and-checks branch from 3ae9004 to 465ae40 Compare June 5, 2026 14:36
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

Comment thread test/windows/InstallerTests.cpp Outdated
Comment on lines +755 to +758
// Restore ProductCode and reinstall cleanly for subsequent tests.
restoreProductCode.reset();
InstallMsi();
ValidatePackageInstalledProperly();
@yeelam-gordon yeelam-gordon force-pushed the fix/system-vhd-rollback-and-checks branch from 465ae40 to ada5779 Compare June 5, 2026 16:19
Copilot AI review requested due to automatic review settings June 5, 2026 16:30
@yeelam-gordon yeelam-gordon force-pushed the fix/system-vhd-rollback-and-checks branch from ada5779 to 0851863 Compare June 5, 2026 16:30
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

Comment thread test/windows/InstallerTests.cpp Outdated
Comment on lines +711 to +713
auto wslExePath = m_installedPath / WSL_BINARY_NAME;
VERIFY_IS_TRUE(std::filesystem::exists(wslExePath));

Comment thread test/windows/InstallerTests.cpp Outdated
Comment on lines +745 to +747
// Verify rollback restored the old product's files.
VERIFY_IS_TRUE(std::filesystem::exists(wslExePath));

@yeelam-gordon yeelam-gordon force-pushed the fix/system-vhd-rollback-and-checks branch from 0851863 to 8889dc3 Compare June 5, 2026 16:42
Copilot AI review requested due to automatic review settings June 5, 2026 17:43
@yeelam-gordon yeelam-gordon force-pushed the fix/system-vhd-rollback-and-checks branch from 8889dc3 to 47bb518 Compare June 5, 2026 17:43
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

Comment on lines +697 to +701
TEST_METHOD(MsiUpgradeRollbackRestoresFiles)
{
// Remove the MSI package — mirrors the setup in MsixUpgradeFails.
UninstallMsi();
VERIFY_IS_FALSE(IsMsiPackageInstalled());
@yeelam-gordon yeelam-gordon force-pushed the fix/system-vhd-rollback-and-checks branch from 47bb518 to 6dd9d9e Compare June 5, 2026 17:51
Add MsiUpgradeRollbackRestoresFiles test that validates the
Schedule="afterInstallInitialize" fix by:

1. Installing an older WSL version (2.0.2)
2. Locking wsl.exe to force the upgrade to fail
3. Verifying rollback restores files and MSI registration
4. Reinstalling current version for subsequent tests

Follows the same pattern as MsixUpgradeFails() but tests the
MSI-to-MSI upgrade path with rollback verification.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 5, 2026 19:02
@yeelam-gordon yeelam-gordon force-pushed the fix/system-vhd-rollback-and-checks branch from 6dd9d9e to 2f06d8d Compare June 5, 2026 19:02
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

Comment on lines +697 to +702
TEST_METHOD(MsiUpgradeRollbackRestoresFiles)
{
// Verify that direct MSI install via msiexec works after uninstall.
// This validates the test infrastructure before testing failure scenarios.
UninstallMsi();
VERIFY_IS_FALSE(IsMsiPackageInstalled());
std::chrono::minutes(2),
[]() { return wil::ResultFromCaughtException() == E_ABORT; });

VERIFY_ARE_EQUAL(0L, exitCode);
Comment on lines +707 to +711
std::wstring commandLine;
THROW_IF_FAILED(wil::GetSystemDirectoryW(commandLine));
commandLine += std::format(L"\\msiexec.exe /qn /norestart /i \"{}\" /L*V \"{}\"", m_msiPath, logPath);

LogInfo("Calling msiexec: %ls", commandLine.c_str());
Development

Successfully merging this pull request may close these issues.

Cannot start WSL: Wsl/Service/CreateInstance/CreateVm/MountVhd/HCS/ERROR_FILE_NOT_FOUND

5 participants