Delivering generated pdfs through ASP.Net
We've had a customer complaining about the performance of generating pdfs using ActivePDF Toolkit over ASP.Net.
For the most part, we're taking existing, fielded pdf templates, filling them with instance data, and then shipping the bytes back down the IIS request stream.
I pulled the toolkit C# assembly into Reflector, and it appears to be a pretty thin veneer around the C++ code, so it looks like we have a lot of overhead marshaling bytes from the C# to the C++ and back before we fulfill the request. This is also in keeping with the memory footprint we see in IIS when we do this.
Where we can, I'm thinking of using fewer
TK.InputByteArray = PdfBytes; // marshaling the bytes from C# to C++
calls and replacing them with TK.OpenInputFile(path) calls to cut the C# -> C++ marshaling overhead...
But I got to wondering about how to cut the overhead of marshaling the result to the C# response stream.
Has anyone ever tried to use named pipes between C++ and C# to cut the byte shuffle down? If so, how did that work out?
The idea would be to have the C# code open "\\.\pipe\myrequestpipe", read, say, 10k blocks from it, and write them to Response.OutputStream.
Our customers producing pdfs in Asia are getting 20+ meg pdf results, so the overhead of having 20 meg in the C++, 20 meg in C# after the TK.BinaryImage(), and 20 meg in Response.OutputStream adds up.
If I can drop the 20 meg in the middle out with a named pipe that could help the load.
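To make the "read 10k blocks and write them to Response.OutputStream" idea concrete, here is a minimal sketch of the block-copy loop on the C# side. It works against plain Stream objects; the class and method names (ChunkedCopy, CopyInBlocks) are made up for illustration, and whether the C++ side can actually write its output to a pipe is not something this sketch settles.

```csharp
using System;
using System.IO;

public static class ChunkedCopy
{
    // Copies everything from source to destination in fixed-size blocks,
    // so only one block (10k by default) is held in managed memory at a time.
    // Returns the total number of bytes copied.
    public static long CopyInBlocks(Stream source, Stream destination, int blockSize = 10 * 1024)
    {
        var buffer = new byte[blockSize];
        long total = 0;
        int read;
        // Read returns 0 only at end of stream; short reads are normal for pipes.
        while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            destination.Write(buffer, 0, read);
            total += read;
        }
        return total;
    }
}
```

In the page or handler, source would presumably be something like new NamedPipeClientStream(".", "myrequestpipe", PipeDirection.In) from System.IO.Pipes (after calling Connect()), and destination would be Response.OutputStream. Setting Response.BufferOutput = false beforehand should keep ASP.NET from buffering the whole 20 meg before sending it.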
I have a couple of recommendations right off the bat :
1) After setting TK.InputByteArray, you can set PdfBytes = null so the GC can deal with it. The internal TK .NET interface code does marshal the byte array, but only temporarily as it makes a copy into Toolkit's internal memory manager.
2) Calling .Dispose directly after you're done with the file, instead of waiting for the GC, will also free up internal TK memory structures.
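The two recommendations above can be sketched roughly as follows. ToolkitStandIn is a hypothetical stand-in written only for this example, not the real Toolkit class; it just mimics the described behavior (the setter copies the bytes internally, Dispose releases them) so the calling pattern can be shown in a self-contained way.

```csharp
using System;

// Hypothetical stand-in for the real Toolkit object; only the shape matters here.
public sealed class ToolkitStandIn : IDisposable
{
    private byte[] _internalCopy;
    public bool Disposed { get; private set; }

    // Mimics the behavior described above: the setter copies the bytes internally,
    // so the caller's array is no longer needed once the assignment returns.
    public byte[] InputByteArray
    {
        set { _internalCopy = (byte[])value.Clone(); }
    }

    public void Dispose()
    {
        _internalCopy = null;
        Disposed = true;
    }
}

public static class Demo
{
    // Returns true if the stand-in was disposed when the using block exited.
    public static bool FillTemplate(byte[] pdfBytes)
    {
        ToolkitStandIn observed;
        using (var tk = new ToolkitStandIn())
        {
            observed = tk;
            tk.InputByteArray = pdfBytes;
            pdfBytes = null; // drop this reference so the GC can reclaim the array
            // ... fill fields and produce the output here ...
        } // Dispose runs here, without waiting for the GC
        return observed.Disposed;
    }
}
```

Note that setting pdfBytes = null only helps if no other variable (for example in the calling method) still references the same array.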
As for your follow-up comment: what TK is "seeing" is not a named pipe but rather a pointer to a mount point called \\.\pipe .
I have a couple of other recommendations/questions :
1) Are you loading the input file via a byte array because it's coming from a centralized source? (eg SQL Server) If so, have you considered implementing a cache system for the input file? It might actually perform faster in the long run.
2) You mentioned that you are dealing with PDF form templates originating in Asia. When dealing with templates like these, the biggest issue is CJK fonts. If you are not using one of the Adobe Acrobat built-in fonts, the entire font is typically placed into the PDF because the actual characters used cannot be anticipated...the result is a 20MB PDF where 16MB of it is just the font! A couple of things you can do in this situation: a) Use the Adobe Acrobat built-in CJK fonts. Toolkit completely supports those as well as all the encodings for those fonts. b) Host the font on your server but don't embed it in the template PDF. Toolkit will pick it up and use the external font.
3) Do the users actually interact with these PDF forms once they are filled out? If not, have you considered flattening them AND making sure subset fonts is turned on? Toolkit will squeeze that 20MB file way down.
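For the caching idea in question 1, one possible shape is a small cache that pulls each template from the central source once, writes it to local disk, and hands back a file path that could then be passed to TK.OpenInputFile(path) instead of marshaling a byte array. This is only a sketch: TemplateFileCache, the loader delegate, and the template id scheme are all assumptions, and there is no invalidation when a template changes at the source.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;

public sealed class TemplateFileCache
{
    // Lazy<string> ensures the loader runs only once per template id,
    // even when several requests ask for the same template concurrently.
    private readonly ConcurrentDictionary<string, Lazy<string>> _paths =
        new ConcurrentDictionary<string, Lazy<string>>();
    private readonly string _cacheDir;
    private readonly Func<string, byte[]> _loadFromSource;

    // loadFromSource is a hypothetical callback that fetches the template bytes,
    // e.g. from SQL Server; cacheDir is a local folder the app pool can write to.
    public TemplateFileCache(string cacheDir, Func<string, byte[]> loadFromSource)
    {
        _cacheDir = cacheDir;
        _loadFromSource = loadFromSource;
        Directory.CreateDirectory(cacheDir);
    }

    // Returns the local file path for a template, loading it on first use.
    // Assumes templateId is already safe to use as a file name.
    public string GetPath(string templateId)
    {
        return _paths.GetOrAdd(templateId, id => new Lazy<string>(() =>
        {
            string path = Path.Combine(_cacheDir, id + ".pdf");
            File.WriteAllBytes(path, _loadFromSource(id));
            return path;
        })).Value;
    }
}
```

A single instance would typically be held in a static field so that it lives for the lifetime of the application rather than a single request.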
I hope some of these suggestions help. I can't always say when I'll be poking around, so for further assistance I recommend opening a ticket in Support.