
There have been many questions recently about drawing PDFs.

Yes, you can render PDFs very easily with a UIWebView, but this can't give the performance and functionality that you would expect from a good PDF viewer.

You can draw a PDF page to a CALayer or to a UIImage. Apple even has sample code showing how to draw a large PDF in a zoomable UIScrollView.

But the same issues keep cropping up.

UIImage Method:

  1. PDFs in a UIImage don't optically scale as well as with a layer approach.
  2. The CPU and memory hit of generating the UIImages from a PDF context limits or prevents using it to render new zoom levels in real time.

CATiledLayer Method:

  1. There's a significant overhead (time) in drawing a full PDF page to a CALayer: individual tiles can be seen rendering (even with a tileSize tweak).
  2. CALayers can't be prepared ahead of time (rendered off-screen).

Generally, PDF viewers are pretty heavy on memory too. Just monitor the memory usage of Apple's zoomable PDF example.

In my current project, I'm developing a PDF viewer. I render a UIImage of a page in a separate thread (issues here too!) and present it while the scale is x1. CATiledLayer rendering kicks in once the scale is >1. iBooks takes a similar two-pass approach: if you scroll the pages, you can see a lower-res version of the page for just under a second before a crisp version appears.

I'm rendering 2 pages on each side of the page in focus so that the PDF image is ready to mask the layer before it starts drawing. Pages are destroyed again when they are more than 2 pages away from the focused page.
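The bookkeeping for that page window can be sketched as follows. This is only a minimal, platform-neutral illustration written in Python 3, not the asker's code: `PageWindowCache` and `render_page` are hypothetical names, and `render_page` stands in for whatever actually produces the page image (on iOS, the UIImage render). It assumes pages within 2 of the focused page are kept and anything further away is dropped.

```python
class PageWindowCache:
    """Keeps rendered pages for the focused page and `radius` pages on each side."""

    def __init__(self, page_count, render_page, radius=2):
        self.page_count = page_count
        self.render_page = render_page  # hypothetical callable: page number -> image
        self.radius = radius
        self.pages = {}  # page number -> rendered image

    def focus(self, page):
        """Move the window to `page`: evict distant pages, render missing ones."""
        low = max(1, page - self.radius)
        high = min(self.page_count, page + self.radius)
        wanted = set(range(low, high + 1))

        # Destroy pages that are now outside the window.
        for number in list(self.pages):
            if number not in wanted:
                del self.pages[number]

        # Render pages that have just entered the window.
        for number in sorted(wanted):
            if number not in self.pages:
                self.pages[number] = self.render_page(number)

        return self.pages[page]
```

Calling `focus(5)` on a 10-page document leaves pages 3 to 7 in the cache, so a swipe in either direction finds its image already rendered.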

Does anyone have any insights, no matter how small or obvious, to improve the performance or memory handling of drawing PDFs? Or on any other issues discussed here?

EDIT: Some tips (credit: Luke Mcneice, VdesmedT, Matt Gallagher, Johann):

  • Save any media to disk when you can.

  • Use larger tileSizes if rendering on TiledLayers

  • Init frequently used arrays with placeholder objects; alternatively, another design approach is this one

  • Note that images will render faster than a CGPDFPageRef

  • Use NSOperations or GCD & Blocks to prepare pages ahead of time.

  • Call CGContextSetInterpolationQuality(ctx, kCGInterpolationHigh); and CGContextSetRenderingIntent(ctx, kCGRenderingIntentDefault); before CGContextDrawPDFPage to reduce memory usage while drawing

  • Init'ing your NSOperations with a docRef is a bad idea (memory); wrap the docRef in a singleton instead.

  • Cancel needless NSOperations when you can, especially if they will be using memory. Beware of leaving contexts open, though!

  • Recycle page objects and destroy unused views

  • Close any open Contexts as soon as you don't need them

  • On receiving memory warnings, release and reload the DocRef and any page caches
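Two of the tips above, preparing pages ahead of time on a background queue and cancelling work that is no longer needed, can be illustrated with a small sketch. It is written in Python 3 with `concurrent.futures` purely to show the pattern; on iOS the same roles would be played by NSOperation/NSOperationQueue or GCD. `PagePrefetcher` and `render_page` are hypothetical names, not part of any library mentioned here.

```python
from concurrent.futures import ThreadPoolExecutor


class PagePrefetcher:
    """Renders wanted pages in the background and cancels stale requests."""

    def __init__(self, render_page, max_workers=1):
        self._render_page = render_page  # hypothetical callable: page number -> image
        self._executor = ThreadPoolExecutor(max_workers=max_workers)
        self._pending = {}  # page number -> Future

    def request(self, wanted_pages):
        """Queue renders for `wanted_pages` and drop requests for all other pages."""
        wanted = set(wanted_pages)

        # cancel() only succeeds for work that has not started yet; a render
        # that is already running finishes and its result is simply discarded.
        for number in list(self._pending):
            if number not in wanted:
                self._pending.pop(number).cancel()

        for number in sorted(wanted):
            if number not in self._pending:
                self._pending[number] = self._executor.submit(self._render_page, number)

    def result(self, number):
        """Block until the requested page is rendered and return it."""
        return self._pending[number].result()

    def shutdown(self):
        """Stop the worker threads once prefetching is no longer needed."""
        self._executor.shutdown(wait=True)
```

Keeping the worker count low is a deliberate choice in this sketch: each render holds a drawing context open, so fewer concurrent renders means less peak memory.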

Other PDF Features:

  • Getting Links inside a PDF (and here and here)

    • Understanding the PDF Rect for link positioning

    • Converting PDF annot datestrings

    • Getting the target of the link (Getting the page number from the /Dest array)

  • Getting a table of contents

  • Document title and Keywords

  • Getting Raw Text (and here and Here and here (positioning focused))

  • Searching (and here) (doesn't work with all PDFs; some just show weird characters. I guess it's an encoding issue, but I'm not sure. Credit: BrainFeeder)

  • CALayer and Off-Screen Rendering - render the next page for fast/smooth display

Documentation

  • Quartz PDFObjects (Used for meta info, annotations, thumbs)
  • Adobe PDF Spec

Example projects

  • Apple/ ZoomingPDF - zooming, UIScrollView, CATiledLayer
  • vfr/ reader - zooming, paging, UIScrollView, CATiledView
  • brow/ leaves - paging with nice transitions
  • / skim - everything it seems (PDF reader/editor for OSX)

 Answers

42

I have built this kind of application using approximately the same approach, except:

  • I cache the generated image on the disk and always generate two to three images in advance in a separate thread.
  • I don't overlay with a UIImage but instead draw the image in the layer when the zoom scale is 1. Those tiles will be released automatically when memory warnings are issued.

Whenever the user starts zooming, I acquire the CGPDFPage and render it using the appropriate CTM. The code in - (void)drawLayer:(CALayer *)layer inContext:(CGContextRef)context looks like this:

// The x scale of the current transform; 1.0 is treated as "not zoomed".
CGAffineTransform currentCTM = CGContextGetCTM(context);
if (currentCTM.a == 1.0 && baseImage) {
    // Not zoomed: draw the cached image, aspect-fitted and centered in the bounds.
    CGFloat scaleForWidth = baseImage.size.width / self.bounds.size.width;
    CGFloat scaleForHeight = baseImage.size.height / self.bounds.size.height;
    CGFloat imageScaleFactor = MAX(scaleForWidth, scaleForHeight);

    CGSize imageSize = CGSizeMake(baseImage.size.width / imageScaleFactor, baseImage.size.height / imageScaleFactor);
    CGRect imageRect = CGRectMake((self.bounds.size.width - imageSize.width) / 2, (self.bounds.size.height - imageSize.height) / 2, imageSize.width, imageSize.height);
    CGContextDrawImage(context, imageRect, [baseImage CGImage]);
} else {
    // Zoomed: render the PDF page itself. The lock guards pdfDoc, which is
    // released and recreated on memory warnings.
    @synchronized(issue) {
        // CGPDFDocumentGetPage uses 1-based page numbers.
        CGPDFPageRef pdfPage = CGPDFDocumentGetPage(issue.pdfDoc, pageIndex + 1);
        CGAffineTransform pdfToPageTransform = CGPDFPageGetDrawingTransform(pdfPage, kCGPDFMediaBox, layer.bounds, 0, true);
        CGContextConcatCTM(context, pdfToPageTransform);
        CGContextDrawPDFPage(context, pdfPage);
    }
}

issue is the object containing the CGPDFDocumentRef. I synchronize the part where I access the pdfDoc property because I release it and recreate it when receiving memory warnings. It seems that the CGPDFDocumentRef object does some internal caching that I could not find a way to get rid of.
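The release-and-recreate pattern described here can be sketched independently of Core Graphics. The following is a minimal Python 3 illustration, not the answer's actual code: `ReloadableDocument` and `open_document` are hypothetical names, and the sketch assumes the document is reopened lazily on the next access rather than immediately after the warning.

```python
import threading


class ReloadableDocument:
    """Guards a document handle that can be dropped and reopened at any time."""

    def __init__(self, open_document):
        self._open_document = open_document  # hypothetical callable that opens the PDF
        self._lock = threading.Lock()
        self._document = None

    def with_document(self, use):
        """Run `use(document)` while holding the lock, opening the document if needed."""
        with self._lock:
            if self._document is None:
                self._document = self._open_document()
            return use(self._document)

    def handle_memory_warning(self):
        """Drop the handle (and whatever it cached); the next access reopens it."""
        with self._lock:
            self._document = None
```

Because every access goes through the same lock, a drawing thread can never be handed a document that a memory warning is in the middle of releasing, which is the role `@synchronized(issue)` plays in the code above.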

Tuesday, June 1, 2021
 
Ultimater
answered 6 Months ago
14

!!! ATTENTION !!!
!!! THIS ONLY WORKS FOR PYTHON 2 !!!!

I am currently working on an update for Python 3.

virantha's pypdfocr does not work properly with Python 3.
For use with Python 2, happily use the version below.

Finally I came to a solution I can work with.

Using pypdfocr and its pypdfocr_gs library I call

pypdfocr.pypdfocr_gs.PyGs({}).make_img_from_pdf(pdf_file)

to retrieve jpg images and then I use PIL to get ImageTk.PhotoImage instances from it and use them in my code.

ImageTk.PhotoImage(_img_file_handle)

Will add a proper example as soon as I can.

Edit:

As promised here comes the code


    import pypdfocr.pypdfocr_gs as pdfImg
    from PIL import Image, ImageTk
    import Tkinter as tk
    import ttk

    import glob, os

    root=tk.Tk()

    __f_tmp=glob.glob(pdfImg.PyGs({}).make_img_from_pdf("tmptest.pdf")[1])[0]
    #                             ^ this is needed for a "default"-Config
    __img=Image.open(__f_tmp)

    __tk_img=ImageTk.PhotoImage(__img)

    ttk.Label(root, image=__tk_img).grid()

    __img.close()
    os.remove(__f_tmp)

    root.mainloop()

Edit:

With virantha's version of pypdfocr, there seems to be a bug in how it uses Python's subprocess module on Windows 10:

# extract from pypdfocr_gs:
def _run_gs(self, options, output_filename, pdf_filename):
        try:
            cmd = '%s -q -dNOPAUSE %s -sOutputFile="%s" "%s" -c quit' % (self.binary, options, output_filename, pdf_filename)

            logging.info(cmd)        

            # Change this line for Windows 10:
            # out = subprocess.check_output(cmd, shell=True)
            out = subprocess.check_output(cmd)
# end of extract
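
A related option, offered only as a sketch and not as what pypdfocr itself does, is to pass subprocess an argument list instead of a single command string, so that no shell and no manual quoting of file names is involved. `build_gs_args` and `run_gs` are hypothetical helper names, and `options` is assumed here to be a list of Ghostscript flags rather than the string used in the extract above.

```python
import subprocess


def build_gs_args(binary, options, output_filename, pdf_filename):
    """Return the Ghostscript command as an argument list (no shell quoting needed)."""
    return ([binary, "-q", "-dNOPAUSE"]
            + list(options)
            + ["-sOutputFile=%s" % output_filename, pdf_filename, "-c", "quit"])


def run_gs(binary, options, output_filename, pdf_filename):
    """Run Ghostscript without a shell and return its captured output."""
    args = build_gs_args(binary, options, output_filename, pdf_filename)
    return subprocess.check_output(args)
```

With a list, a file name containing spaces is passed through as one argument, which is why the quotes around the output and input names are no longer needed.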
Monday, August 2, 2021
 
Philip Weiser
answered 4 Months ago
48

Another simple way to do this is to set

pdfView.usePageViewController(true) 

This adds swiping between pages for you, so there is no need to set up your own gestures. See the example below:

// Requires `import PDFKit` at the top of the file.
override func viewDidLoad() {
    super.viewDidLoad()

    // Add PDFView to view controller.
    let pdfView = PDFView(frame: self.view.bounds)
    self.view.addSubview(pdfView)

    // Configure PDFView to be one page at a time swiping horizontally
    pdfView.autoScales = true
    pdfView.displayMode = .singlePage
    pdfView.displayDirection = .horizontal
    pdfView.usePageViewController(true)

    // Load the PDF. `url` is assumed to be a String property on this view controller.
    guard let webUrl = URL(string: url) else { return }
    pdfView.document = PDFDocument(url: webUrl)
}
Tuesday, August 24, 2021
 
Platinum Azure
answered 3 Months ago
28

You can use the Poppler library for that.

Tuesday, October 12, 2021
 
John Oleynik
answered 2 Months ago
89

Well, since nobody replied: I think there is a bug in the framework, so I'll post what worked for me after some trial and error.

let initialBounds = annotation.bounds
annotation.bounds = CGRect(
        origin: locationOnPage,
        size: initialBounds.size)
page.removeAnnotation(annotation)
page.addAnnotation(annotation)

It's not elegant, but it does the job.

Saturday, October 23, 2021
 
Roddy
answered 1 Month ago