Summary
PdfReader.Open fails on any PDF whose page(s) do not carry a direct /MediaBox but inherit it from a parent /Pages node. /MediaBox is an inheritable page attribute (PDF 32000-1:2008, §7.7.3.4, Table 30), so this is a valid, common construction; for example, every PDF produced by ComponentOne (GrapeCity) C1.C1Preview PdfExportProvider is laid out this way.
The page object is constructed (which reads/validates /MediaBox) before the inherited attributes are applied, so the lookup fails.
Version
- Broken: PDFsharp 7.0.0-preview-1 (NuGet), net8.0. The same code is present on main (HEAD).
- Works: PDFsharp 6.2.4 and 1.50.5147 — both open the same file as a 612×792 page.
So this is a regression introduced by the 7.0 page-tree rewrite; the 6.x reader handled inherited attributes correctly.
Repro
Attached: c1-inherited-mediabox.pdf — a trivial one-page document produced by ComponentOne PdfExportProvider. Its page tree is:
10 0 obj <</Type /Pages /Count 1 /MediaBox [0 0 612 792] /Resources 11 0 R /Kids [12 0 R]>>
12 0 obj <</Type /Page /Parent 10 0 R /Contents 2 0 R /Resources 11 0 R>>
/MediaBox is on the /Pages node (obj 10) and inherited by the page (obj 12), which has none of its own.
using System.IO;
using PdfSharp.Pdf.IO;
using var fs = File.OpenRead("c1-inherited-mediabox.pdf");
var doc = PdfReader.Open(fs, PdfDocumentOpenMode.Import); // throws
Expected
The page opens with size 612×792, resolving /MediaBox via inheritance (as 1.50 and other readers do).
Actual
System.InvalidOperationException: Page has no MediaBox.
at PdfSharp.Pdf.PdfPage.Initialize(Boolean setupSizeFromMediaBox)
at PdfSharp.Pdf.PdfPage..ctor(PdfDictionary dict)
at PdfSharp.Pdf.ElementsBase.CreateContainer(Type type, PdfContainer oldContainer, Boolean createIndirect)
at PdfSharp.Pdf.PdfPages.TraversePageTree(List`1 pages, PdfPageTreeNode treeNode, InheritedValues inheritedValues)
at PdfSharp.Pdf.PdfPages.FlattenPageTree()
at PdfSharp.Pdf.IO.PdfReader.OpenFromStream(...)
at PdfSharp.Pdf.IO.PdfReader.Open(...)
Root cause
In PdfPages.TraversePageTree (src/foundation/src/PDFsharp/src/PdfSharp/Pdf/PdfPages.cs):
var page = (PdfPage)kid.Elements.CreateContainer(typeof(PdfPage), kid, true); // (1) constructs PdfPage -> throws
page.ApplyInheritedValues(ref inheritedValues); // (2) inherited /MediaBox applied here, too late
CreateContainer (1) runs the PdfPage(PdfDictionary) constructor, which calls Initialize(setupSizeFromMediaBox: true). In PdfPage.Initialize (.../Pdf/PdfPage.cs):
var rectangle = Elements.GetRectangle(InheritablePageKeys.MediaBox, false);
if (rectangle.IsEmpty)
    throw new InvalidOperationException("Page has no MediaBox.");
GetRectangle(..., false) reads only the page's own dictionary (no inheritance), so for an inheriting page it is empty and Initialize throws — before ApplyInheritedValues (2) ever runs.
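The difference between a direct-only lookup and an inheritance-aware one can be shown with a minimal, self-contained sketch. It does not use any PDFsharp types: Node, Lookup.GetDirect, and Lookup.ResolveInherited are hypothetical names, and the dictionary-based model and depth limit are assumptions made purely for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a page-tree node; not a PDFsharp type.
sealed class Node
{
    public Node? Parent { get; init; }
    public Dictionary<string, double[]> Entries { get; } = new();
}

static class Lookup
{
    // Direct-only lookup: consults the node's own dictionary and nothing else.
    public static double[]? GetDirect(Node node, string key) =>
        node.Entries.TryGetValue(key, out var value) ? value : null;

    // Inheritance-aware lookup: walks the /Parent chain until the key is found.
    // maxDepth is an assumed guard against cyclic /Parent references in malformed files.
    public static double[]? ResolveInherited(Node node, string key, int maxDepth = 64)
    {
        Node? current = node;
        for (int depth = 0; current != null && depth < maxDepth; depth++)
        {
            if (current.Entries.TryGetValue(key, out var value))
                return value;
            current = current.Parent;
        }
        return null;
    }
}

static class Program
{
    static void Main()
    {
        // Mirrors the attached file: obj 10 (/Pages) carries /MediaBox, obj 12 (/Page) does not.
        var pages = new Node();
        pages.Entries["/MediaBox"] = new double[] { 0, 0, 612, 792 };
        var page = new Node { Parent = pages };

        // The direct lookup finds nothing on the page itself.
        Console.WriteLine(Lookup.GetDirect(page, "/MediaBox") is null); // True

        // The inheritance-aware lookup finds the box on the parent.
        var box = Lookup.ResolveInherited(page, "/MediaBox")!;
        Console.WriteLine($"{box[2] - box[0]}x{box[3] - box[1]}"); // 612x792
    }
}
```

In this model the direct lookup corresponds to the situation the constructor is in when it runs: the page dictionary has no /MediaBox of its own, even though the document as a whole defines one.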
Suggested fix
Apply inherited values before the page's size is set up. Possible approaches:
- Call ApplyInheritedValues prior to (or as part of) Initialize.
- Have Initialize/GetRectangle consult inherited attributes (walk /Parent) when the page has no direct /MediaBox.
- Defer the MediaBox validation until after inheritance has been promoted in TraversePageTree.
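As a rough illustration of the intended ordering, the sketch below flattens a toy page tree while carrying inherited values downward, and validates /MediaBox only after those values have been promoted onto the page. This is not the PDFsharp implementation and is not a patch: TreeNode, PageTree.Flatten, and Traverse are hypothetical names, and values are modelled as plain strings for brevity.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical page-tree model; none of these are PDFsharp types.
sealed class TreeNode
{
    public bool IsPage { get; init; }
    public Dictionary<string, string> Entries { get; } = new();
    public List<TreeNode> Kids { get; } = new();
}

static class PageTree
{
    // The inheritable page attributes (PDF 32000-1:2008, §7.7.3.4).
    static readonly string[] InheritableKeys = { "/Resources", "/MediaBox", "/CropBox", "/Rotate" };

    public static List<TreeNode> Flatten(TreeNode root)
    {
        var pages = new List<TreeNode>();
        Traverse(root, new Dictionary<string, string>(), pages);
        return pages;
    }

    static void Traverse(TreeNode node, Dictionary<string, string> inherited, List<TreeNode> pages)
    {
        if (node.IsPage)
        {
            // Step 1: promote inherited values the page does not define itself.
            foreach (var (key, value) in inherited)
                node.Entries.TryAdd(key, value);

            // Step 2: only now validate, so an inherited /MediaBox counts.
            if (!node.Entries.ContainsKey("/MediaBox"))
                throw new InvalidOperationException("Page has no MediaBox.");

            pages.Add(node);
            return;
        }

        // Intermediate /Pages node: its own entries override what it inherited.
        var next = new Dictionary<string, string>(inherited);
        foreach (var key in InheritableKeys)
            if (node.Entries.TryGetValue(key, out var value))
                next[key] = value;

        foreach (var kid in node.Kids)
            Traverse(kid, next, pages);
    }
}

static class Program
{
    static void Main()
    {
        var root = new TreeNode();                      // plays the role of obj 10
        root.Entries["/MediaBox"] = "[0 0 612 792]";
        root.Kids.Add(new TreeNode { IsPage = true });  // plays the role of obj 12, no /MediaBox

        var pages = PageTree.Flatten(root);
        Console.WriteLine(pages[0].Entries["/MediaBox"]); // [0 0 612 792]
    }
}
```

A page whose ancestors also lack /MediaBox still fails in this model, so the validation is kept and only moved to a point where inheritance has already been taken into account.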