Bugfix/unarchive course external symlinks #3037

Open
drdrew42 wants to merge 3 commits into openwebwork:WeBWorK-2.21 from drdrew42:bugfix/unarchive-course-external-symlinks

@drdrew42 drdrew42 commented Jul 3, 2026

Member

Summary

On any server whose Archive::Tar includes the recent CVE-2026-42496 / CVE-2026-42497 path-traversal hardening, unarchiving a course fails and the course cannot be restored. Instructor tarball extraction in the File Manager hits the same wall. Three commits address it.

Reproduction

Unarchive any course archive (Course Administration → Unarchive Course):

Symlink 'COURSE/templates/Contrib' target attempts traversal. Not extracting under SECURE EXTRACT MODE
Failed to unarchive course directory for course COURSE

Root cause

Every course archive contains the standard course-template symlinks — Library, Contrib, capaLibrary, Student_Orientation — whose relative targets (e.g. ../../../libraries/...) point outside the course directory.

unarchiveCourse extracts with Archive::Tar->extract() in its default (secure) mode. Recent Archive::Tar refuses, under secure extract mode, to create any symbolic (or hard) link whose target contains a .. component or is absolute, and aborts extraction on the first such entry — so no course can be restored. This is the CVE-2026-42496 (symlink) / CVE-2026-42497 (hardlink) hardening, upstream in Archive::Tar 3.08 and backported into distribution perl packages (e.g. Ubuntu perl-base), so it appears in Archive::Tar builds whose version number (e.g. 2.40) predates the upstream fix.
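The link-target rule described above can be approximated in a few lines of Perl. This is only a sketch of the described behavior, not Archive::Tar's actual implementation; the function name link_target_refused and the sample targets are hypothetical.

```perl
use strict;
use warnings;

# Approximates the rule described above (NOT Archive::Tar's real code):
# a link target is refused if it is absolute or has a ".." component.
# Tar member names always use "/" as the separator.
sub link_target_refused {
    my ($target) = @_;
    return 1 if $target =~ m{^/};
    return (grep { $_ eq '..' } split m{/+}, $target) ? 1 : 0;
}

# Hypothetical targets, for illustration only.
for my $target ('../../../libraries/example', '/opt/example', 'local/example') {
    printf "%-28s %s\n", $target,
        link_target_refused($target) ? 'refused' : 'allowed';
}
```

Under this rule every standard course-template link is refused, because each target climbs out of the course directory with leading ../ components.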

Approach

An earlier revision of this PR simply set INSECURE_EXTRACT_MODE = 1. That is too blunt: it also disables the older protection against regular files escaping the target via ../ or absolute paths, which we want to keep. This revision keeps secure extract mode on and handles only the links:

  1. unarchiveCourse (CourseManagement.pm): remove the link entries from the archive, extract the remaining files under secure extract mode (regular files still cannot escape via ../ or absolute paths), then recreate the symbolic links directly with symlink(). Course archives contain no hard links, so those are removed but not recreated.

  2. Partial-directory cleanup: on extraction failure, remove the partially extracted course directory before moving a displaced course back. Previously a failed restore left an orphaned, database-less stub that blocked a same-name retry with a misleading "Cannot overwrite existing course" error. When unarchiving over an existing course it also stranded the original course under <id>_tmp: move_away had already renamed its directory and DB tables, and move_back could not restore them because the partial extraction still occupied the name.

  3. File Manager (FileManager.pm): the tar-extraction path hits the same issue — an instructor extracting a tarball containing symlinks would have them skipped with an error. It already validates each member's location with path_is_subdir, so recreate symbolic-link members directly instead of extracting them.
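Steps 1 and 2 above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: the helper names extract_with_links and restore_course_dir are hypothetical, the sketch changes directory with chdir to choose the extraction location (the real code may set it differently), and it checks the link's own location with a simple inline test rather than WeBWorK's path_is_subdir.

```perl
use strict;
use warnings;
use Archive::Tar;
use Cwd qw(getcwd);
use File::Basename qw(dirname);
use File::Path qw(make_path remove_tree);

# Hypothetical helper, not the PR's code. Extracts $archive_path into
# $dest_dir with secure extract mode left on, recreating symlinks by hand.
sub extract_with_links {
    my ($archive_path, $dest_dir) = @_;

    my $tar = Archive::Tar->new($archive_path)
        or die "Cannot read $archive_path: $Archive::Tar::error\n";

    # Remember each symlink (location => target) and drop every link entry.
    my (%symlinks, @link_paths);
    for my $entry ($tar->get_files) {
        next unless $entry->is_symlink || $entry->is_hardlink;
        push @link_paths, $entry->full_path;
        $symlinks{ $entry->full_path } = $entry->linkname if $entry->is_symlink;
    }
    $tar->remove(@link_paths) if @link_paths;

    # Extract what is left, relative to $dest_dir.
    my @remaining = $tar->get_files;
    if (@remaining) {
        my $orig_dir = getcwd();
        chdir $dest_dir or die "Cannot chdir to $dest_dir: $!\n";
        my @extracted = $tar->extract;
        chdir $orig_dir or die "Cannot chdir back to $orig_dir: $!\n";
        die 'Extraction failed: ' . $tar->error . "\n" unless @extracted;
    }

    # Recreate links only after all files are written. The link's own
    # location must stay inside $dest_dir; its target is not restricted.
    for my $path (sort keys %symlinks) {
        die "Refusing link location $path\n"
            if $path =~ m{^/} || (grep { $_ eq '..' } split m{/+}, $path);
        my $link = "$dest_dir/$path";
        make_path(dirname($link));
        symlink($symlinks{$path}, $link) or die "Cannot create $link: $!\n";
    }
    return 1;
}

# Hypothetical wrapper showing the failure cleanup from step 2.
sub restore_course_dir {
    my ($archive_path, $courses_dir, $course_id) = @_;
    return 1 if eval { extract_with_links($archive_path, $courses_dir) };
    my $error = $@;
    # Any pre-existing course was already moved aside, so this directory
    # can only be the incomplete extraction.
    remove_tree("$courses_dir/$course_id");
    die "Failed to unarchive course directory for course $course_id: $error";
}
```

Because the links are created only after extraction has finished, no archive member can be written through a link that the same archive supplied.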

Security note

Recreated symbolic links are unrestricted in target, matching the pre-3.08 behavior WeBWorK has always relied on for the standard course links. This grants no new read access: access through a symlink remains governed by the existing $webworkDirs{valid_symlinks} course environment option (FileManager::isSymLink), which applies to every symlink regardless of how it was created. Regular-file extraction stays fully protected by secure extract mode, and files are extracted before any link is recreated, so nothing is ever written through a link.

Why it's surfacing now

Date        Event
2026-05-22  Archive::Tar 3.08 released with the linkname validation
2026-05-26  CVE-2026-42496 / CVE-2026-42497 published
2026-06-12  Ubuntu ships the backport (perl-base 5.38.2-3.2ubuntu0.3)

As this rolls out across distributions, every WeBWorK server loses course-unarchive — including the project's own Docker image on the Ubuntu-26 base (#2996).

Testing

Verified on WeBWorK 2.20 / Perl 5.38.2 / Archive::Tar 2.40 with the CVE-2026-42496 backport: a real course archive (containing all four standard template symlinks) fails to unarchive before the change; after it, extraction succeeds, all four links are recreated with correct targets, and the course restores. perltidy clean.

Notes

bin/download-OPL-metadata-release.pl also extracts with Archive::Tar, but it is intentionally left unchanged. CourseManagement.pm is identical on develop.

drgrice1 commented Jul 3, 2026

Member

The file manager is also going to need some related changes for this issue. If an instructor extracts a tarball into the templates directory that contains symbolic links, the same issue can occur.

drgrice1 commented Jul 3, 2026

Member

This will need more thought.

The INSECURE_EXTRACT_MODE option is not new, and the behavior with it unset was desired: it prevents regular files from being extracted outside of the target location, so files with paths like ../../../file could not be extracted outside of the course directory. The problem is the new protection against hard and symbolic links, added in version 3.08 of the Archive::Tar package (and backported to older versions in many distributions).

So either we need to set INSECURE_EXTRACT_MODE to 1, but verify that regular files are not being written outside the desired target location (the course directory in this case), or we need to leave it set to 0, and eliminate symbolic links entirely, and find another way to deal with the current usage of symbolic links.

@drdrew42 drdrew42 force-pushed the bugfix/unarchive-course-external-symlinks branch from 92ee132 to ae10841 Compare July 3, 2026 19:20
drdrew42 and others added 3 commits July 3, 2026 15:29
Newer Archive::Tar refuses, under its secure extract mode, to extract
symbolic and hard links whose targets fall outside the extraction
directory (the CVE-2026-42496 / CVE-2026-42497 hardening in Archive::Tar
3.08, backported into distribution perl packages), and aborts the whole
extraction when an archive contains one.

Every course archive contains the standard template links (Library,
Contrib, capaLibrary, Student_Orientation), whose relative targets point
outside the course directory, so no course could be unarchived once the
system's Archive::Tar carried the fix.

Remove the link entries and extract the remaining files with secure
extract mode still enabled -- so regular files cannot escape the course
directory via absolute or ../ paths -- then recreate the symbolic links
directly. Access through a recreated link remains governed by the
valid_symlinks course environment option, so this grants no new read
access.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

When extraction fails partway through, it leaves the partially extracted
course directory behind. unarchiveCourse only moved a displaced course
back into place, so the incomplete directory remained.

For a fresh restore this left an orphaned, database-less course stub that
blocked a same-name retry with a misleading "Cannot overwrite existing
course" error. When unarchiving over an existing course it was worse:
_unarchiveCourse_move_away has already renamed the existing course to
<id>_tmp, and _unarchiveCourse_move_back then fails because renameCourse
refuses to move it back onto the name still occupied by the partial
extraction, leaving the original course stranded under _tmp.

Remove the partially extracted directory in the failure branch before
moving any displaced course back. _unarchiveCourse_move_away has already
moved a pre-existing course of this name aside, so the directory can only
be the incomplete extraction.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The File Manager's tar extraction hits the same problem as course
unarchiving: newer Archive::Tar refuses, under secure extract mode, to
extract symbolic links whose targets leave the extraction directory (the
CVE-2026-42496 hardening), so an instructor extracting a tarball that
contains such links would have those links skipped with an error.

The extraction already validates each member's location with
path_is_subdir, so recreate symbolic-link members directly with symlink()
instead of extracting them. Access through the link remains governed by
the valid_symlinks course environment option.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@drdrew42 drdrew42 force-pushed the bugfix/unarchive-course-external-symlinks branch from ae10841 to b29b201 Compare July 3, 2026 19:30

@drgrice1 drgrice1 left a comment

Member

Looks good now. Thanks.

Archive::Tar on Ubuntu 26 is version 3.02_001, but it also has this change backported.
