Flash on a PDF with miniPDF.py…

February 11, 2010

Due to the recent advances in exploitation techniques it has become really important to put Flash everywhere we can.


In this post we are going to show how to add an SWF (Flash) file to a PDF file using our miniPDF.py lib.

Flash support is relatively new in PDF and came onto the scene primarily for doing the PDF portable collection thing and such. We’ll follow the steps described in the Adobe® Supplement to the ISO 32000, so you probably need to grab it and keep it close. In case you’ve missed the previous posts, here you have a copy of miniPDF.py so you can take a quick look. We are going to use that lib mainly as we did in earlier posts and start adding PDF objects until… FLASH! …we end up with a one-page PDF with a running embedded SWF. OK, so let’s start…

First we import the lib and create a PDFDoc object representing a document in memory …

from miniPDF import *
doc = PDFDoc()

… prepare an empty content stream for the page and add it to the document.

contents = PDFStream('')
doc.add(contents)

The minimal page object. We construct it and add it to the document like this…

page = PDFDict()
page.add("Type", PDFName("Page"))
page.add("Contents", PDFRef(contents))
doc.add(page)

… then we need the list of pages. In this case it contains just our blank page.

pages = PDFDict()
pages.add("Type", PDFName("Pages"))
pages.add("Kids", PDFArray([PDFRef(page)]))
pages.add("Count", PDFNum(1))
doc.add(pages)

Let’s be nice and honor the PDF structure the specification asks for: we link the page to its parent.

page.add("Parent", PDFRef(pages))

And finally we add the catalog, which is the root object of this PDF.

catalog = PDFDict()
catalog.add("Type", PDFName("Catalog"))
catalog.add("Pages", PDFRef(pages))
doc.add(catalog)
doc.setRoot(catalog)

If we render that like this…

print doc

we’ll get a clean minimalistic PDF file with just one blank page.
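
To get a feeling for what the lib has to emit for such a file, here is a rough, self-contained sketch that writes the same four objects by hand. It does not use miniPDF.py at all: the `build_minimal_pdf` helper, the object numbering and the MediaBox are choices made only for this illustration, and it is written for Python 3 (the snippets in this post are Python 2).

```python
def build_minimal_pdf():
    """Hand-roll a one-page blank PDF and return the file as bytes."""
    # Object bodies in object-number order (1..4).
    # The MediaBox (US Letter) is an addition of this sketch.
    bodies = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [ 3 0 R ] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [ 0 0 612 792 ] /Contents 4 0 R >>",
        b"<< /Length 0 >>\nstream\n\nendstream",
    ]
    out = bytearray(b"%PDF-1.7\n")
    offsets = []
    for num, body in enumerate(bodies, start=1):
        offsets.append(len(out))  # byte offset where "N 0 obj" starts
        out += b"%d 0 obj\n" % num + body + b"\nendobj\n"
    xref_pos = len(out)
    # Cross-reference table: every entry is exactly 20 bytes long.
    out += b"xref\n0 %d\n" % (len(bodies) + 1)
    out += b"0000000000 65535 f \n"
    for off in offsets:
        out += b"%010d 00000 n \n" % off
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(bodies) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)


if __name__ == "__main__":
    import sys
    # Write raw bytes so no newline translation shifts the offsets.
    sys.stdout.buffer.write(build_minimal_pdf())
```

The point to notice is the bookkeeping: every object's byte offset has to be recorded while writing, which is exactly the kind of chore we want a lib to do for us.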

Here you have the mkMINIPDF.py Python file and the generated example.

-Hey Mom look what I did!! A mini blank PDF file!!! look! look!

Not so exciting though.

The annotation

As stated in the Adobe® Supplement to the ISO 32000, Flash support in PDF is implemented as a type of annotation; more precisely, annotation type “RichMedia”. So we go back to the PDF32000 specification, section 12.5, and take a look at what an annotation is.

“An annotation associates an object such as a note, sound, or movie with a location on a page of a PDF document, or provides a way to interact with the user by means of the mouse and keyboard. PDF includes a wide variety of standard annotation types.”

So we construct the RichMedia annotation object with all the required fields …

annot = PDFDict()
annot.add('Type', PDFName('Annot'))
annot.add('Subtype', PDFName('RichMedia'))
annot.add('Rect','[ 266 116 430 204 ]')
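
The Rect value is the usual PDF rectangle: the lower-left and upper-right corners, in default user space units (1/72 inch unless the page says otherwise). A tiny sketch to see how big our Flash area is; `rect_size` is just a helper name invented here, and it is Python 3.

```python
def rect_size(rect):
    """Return (width, height) in points for a PDF rectangle [llx lly urx ury]."""
    llx, lly, urx, ury = rect
    # Take absolute differences so the corner order does not matter.
    return abs(urx - llx), abs(ury - lly)


width, height = rect_size([266, 116, 430, 204])
print(width, height)                # 164 88
print(width / 72.0, height / 72.0)  # the same size in inches
```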

… and we add it to our page.

page.add("Annots", PDFArray([PDFRef(annot)]))

This has nothing to do with Flash yet. If we keep going through the Adobe® Supplement to the ISO 32000, TABLE 9.49 lists the extra annotation entries specific to a RichMedia annotation, which are RichMediaSettings and RichMediaContent. So let’s add those two to the annotation dictionary.

Add an empty RichMediaSettings container to the document…

RMS = PDFDict()
doc.add(RMS)

… then the same with a RichMediaContent dictionary.

RMC = PDFDict()
doc.add(RMC)

Both are empty for now; we add them to the annotation…

annot.add('RichMediaSettings', PDFRef(RMS))
annot.add('RichMediaContent', PDFRef(RMC))

The RichMediaSettings

“Annotation described in Section 9.5.1 of the PDF Reference. The RichMediaSettings dictionary stores the conditions and responses that occur in response to certain events, such as activation and deactivation of the annotation, and contains two dictionaries.”

For the RichMediaSettings dictionary we need an activation and a deactivation dictionary, basically telling when the annotation should activate and deactivate…

First we add the activation dictionary. The ‘PO’ condition means ‘when the page containing the annotation is opened’. There are other options in the doc.

activation = PDFDict()
activation.add('Type', PDFName('RichMediaActivation'))
activation.add('Condition', PDFName('PO'))

And the deactivation dictionary. The ‘XD’ means ‘run until deactivated by the user’.

deactivation = PDFDict()
deactivation.add('Type', PDFName('RichMediaDeactivation'))
deactivation.add('Condition', PDFName('XD'))

And then the RichMediaSettings, flagging the annotation as being of type ‘Flash’. Note that we’ve already constructed and added an empty object representing this a couple of lines before. We just populate it.

RMS.add('Activation', PDFRef(activation))
RMS.add('Deactivation', PDFRef(deactivation))
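
Written out by hand, the settings we just described would look roughly like the output of the sketch below. It is a Python 3 toy serializer and has nothing to do with how miniPDF.py really renders objects: `to_pdf` is a name invented here, names are passed as strings starting with a slash, and the two sub-dictionaries are inlined instead of being referenced as indirect objects.

```python
def to_pdf(value):
    """Render a small subset of Python values as PDF syntax."""
    if isinstance(value, dict):
        inner = " ".join("/%s %s" % (k, to_pdf(v)) for k, v in value.items())
        return "<< %s >>" % inner
    if isinstance(value, str) and value.startswith("/"):
        return value  # already a PDF name such as /PO
    raise TypeError("unsupported value: %r" % (value,))


settings = {
    "Activation": {"Type": "/RichMediaActivation", "Condition": "/PO"},
    "Deactivation": {"Type": "/RichMediaDeactivation", "Condition": "/XD"},
}
print(to_pdf(settings))
```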

The RichMediaContent

“The RichMediaContent dictionary contains content that is present within the annotation as referenced by the RichMediaSettings dictionary.”

For the RichMediaContent dictionary we first need at least two things: the assets, a name tree of embedded file specification dictionaries, and a bunch of RichMediaConfiguration dictionaries.

The assets name tree is the one pointing to the files involved, for example our .swf file. An asset name tree looks like this:

29 0 obj
<< /Names [ (Flash.swf) 31 0 R ] >>
endobj
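
One detail worth keeping in mind if you ever embed more than one file: the keys in a name tree's Names array have to be sorted. A small sketch, in Python 3, with a made-up helper name and made-up object numbers; it does no escaping, so it only suits plain ASCII filenames without parentheses or backslashes.

```python
def names_array(assets):
    """Build the text of a /Names array from {filename: object number}."""
    parts = []
    # For plain ASCII names, Python's string sort matches byte order.
    for name in sorted(assets):
        parts.append("(%s) %d 0 R" % (name, assets[name]))
    return "[ " + " ".join(parts) + " ]"


print(names_array({"b.swf": 32, "Flash.swf": 31}))
# [ (Flash.swf) 31 0 R (b.swf) 32 0 R ]
```

Note that uppercase letters sort before lowercase ones, so Flash.swf comes first here.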

We take the file embedding functionality from this post and will not treat it here; there is enough PDF madness with the Flash part. The _zipEmbeddeFile function takes a filename and returns a filespec object after embedding the file into the PDF doc. We take the Flash filename from the first argument to the script.

assets = PDFDict()
swfname = PDFString(sys.argv[1])
efref = PDFRef(_zipEmbeddeFile(doc, sys.argv[1]))
assets.add('Names',PDFArray([swfname, efref]))
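
We do not reproduce _zipEmbeddeFile here, but judging by its name it stores the file compressed. As a rough idea of what a compressed embedded file stream involves, here is a Python 3 sketch; `flate_stream` is a hypothetical helper, not part of miniPDF.py, and the payload is stand-in bytes rather than a real SWF.

```python
import zlib


def flate_stream(data):
    """Return a PDF stream object body holding `data`, Flate-compressed."""
    packed = zlib.compress(data)
    # /Length is the size of the compressed bytes, not of the original.
    header = b"<< /Type /EmbeddedFile /Filter /FlateDecode /Length %d >>" % len(packed)
    return header + b"\nstream\n" + packed + b"\nendstream"


payload = b"FWS" + b"\x00" * 100  # stand-in bytes, not a real SWF
body = flate_stream(payload)
print(len(payload), len(body))
```

With a real file you would read it in binary mode, for example `open(path, "rb").read()`, before handing the bytes over.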

Now we need the RichMediaConfiguration dictionaries, which in our case will be just one (see Adobe® Supplement to the ISO 32000#TABLE 9.51).

RichMediaConfiguration Dictionary

“The RichMediaConfiguration dictionary describes a set of instances that are loaded for a given scene configuration. The configuration to be loaded when an annotation is activated is referenced by the Configuration key in the RichMediaActivation dictionary specified in the RichMediaSettings dictionary.”

But first let’s declare the instances array we need for the RichMediaConfiguration. We’ll populate it in a while.

instances = []

And the actual RichMediaConfiguration.

RMCfg = PDFDict()
RMCfg.add('Instances', PDFArray(instances))

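And now we have most of what the RichMediaContent needs, so let’s fill it in. Remember that RMC already exists and is referenced from the annotation, so we populate it instead of creating a new one…

RMC.add('Type', PDFName('RichMediaContent'))
RMC.add('Assets', PDFRef(assets))
RMC.add('Configurations', PDFArray([PDFRef(RMCfg)]))

But we have left the instances array empty, and erg… we need it, so…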

RichMediaInstance Dictionary

“The RichMediaInstance dictionary, referenced by the Instances entry of the RichMediaConfiguration dictionary (“RichMediaConfiguration Dictionary” on page 88), describes a single instance of an asset with settings to populate the artwork of an annotation, as described in Table 9.51b.”

We are basically going to use this for designating which embedded file is the Flash and for passing arguments to it. Yes, we can pass arguments to it!!!

The RichMediaInstances array looks like this:

15 0 obj                    % RichMediaInstances array
[ 17 0 R ]
endobj
17 0 obj
<< /Type /RichMediaInstance
   /Subtype /Flash
   /Asset 31 0 R
   /Params 18 0 R
>>
endobj

And now we put together our only RichMediaInstance dictionary (see Adobe® Supplement to the ISO 32000#TABLE 9.51b)…

RMI = PDFDict()
RMI.add('Type', PDFName('RichMediaInstance'))
RMI.add('Subtype', PDFName('Flash'))
RMI.add('Asset', efref)

And add it to the list of instances referenced from the RichMediaConfiguration dict.

instances.append(PDFRef(RMI))

Also, for passing parameters we can add a RichMediaParams dictionary (see Adobe® Supplement to the ISO 32000#TABLE 9.51c). We get the parameters from the content of the file named in the second argument passed to the script.

RMParams = PDFDict()
RMParams.add('Type', PDFName('RichMediaParams'))
RMParams.add('FlashVars', PDFString(file(sys.argv[2]).read()))
RMParams.add('Binding', PDFName('Background'))
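
The FlashVars string is the familiar name=value&name=value format, URL-encoded. If you need to produce a vars file, something like this Python 3 sketch would do; `make_flashvars` and the parameter names are made up for the example, and your SWF has to actually read whatever names you pass.

```python
from urllib.parse import urlencode, quote


def make_flashvars(params):
    """Encode a dict as a FlashVars string: name=value pairs joined by '&'."""
    # quote_via=quote encodes spaces as %20 rather than '+'.
    return urlencode(params, quote_via=quote)


print(make_flashvars({"title": "hello world", "speed": 3}))
# title=hello%20world&speed=3
```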

Also we need to link it from the RichMediaInstance…

RMI.add('Params', PDFRef(RMParams))

THAT’S IT!!! We only need to render the PDF…

print doc

Uff! Finally! The resulting Python script has this form and it runs like this:

python mkSWFPDF.py myFlash.swf myFlashVarsInAfile.vars >SWFPDF.pdf

And it works!!! I took an SWF from the web and put it in; here is the screenshot…

And HERE you have the test bundle with all this.

Untested and related: in my tests authplay.dll, the DLL providing all the Flash functionality to Adobe Reader, is loaded at a fixed address in XPSPx when in IE or standalone, which means you can bypass DEP through some ret2authplay.dll. Also, when standalone, the Reader doesn’t opt in to DEP.


One Response to “Flash on a PDF with miniPDF.py…”

  1. Ange said

    Nice !

    2 little windows-related fixes:
    the SWF file should be read as binary, and redirecting the output creates wrong return characters.
    @@ -6,7 +6,7 @@
     import zlib,sys,md5

     def _zipEmbeddeFile(doc, f,minimal=False):
    -    fileStr = file(f).read()
    +    fileStr = file(f, "rb").read()
         embedded = PDFStream(fileStr)
         if not minimal:
             embedded.add('Type', PDFName('EmbeddedFile'))
    @@ -167,7 +167,9 @@

     #add the RichMedia annotation to the 1st page
     page.add("Annots", PDFArray([PDFRef(annot)]))
    -print doc
    +f = open(sys.argv[3], "wb")

    also you didn’t include the (empty) vars file.
