ActivePDF Interview: Jason Stockett Talks Performance
The ActivePDF team includes many knowledgeable experts in the PDF tech space. Recently, we sat down with one such expert, Jason Stockett, whose knowledge of ActivePDF products spans many years, to discuss product performance and preparation for peak activity periods. Here’s what Jason had to share:
What kind of performance can we expect out of ActivePDF products?
Jason Stockett: There are a lot of variables that ultimately determine what kind of performance you get – from the file types you are converting and the content of the files to the hardware you are using for your ActivePDF server.
We have run tests on basic machines that are capable of running hundreds of thousands of conversions per day with the DocSight™ DocConverter product. With a high-performance server, the per-day conversion count can go higher still.
What kind of hardware should we use for our server?
JS: In my experience, the “speed” of the CPU is the biggest factor with regard to performance. The number of CPU cores will also impact performance when multi-threading, but the clock speed is a bigger factor.
In our testing, we compared a standard 4-core desktop CPU against an 8-core Xeon processor of the same generation. We found that while the 8-core Xeon was capable of more simultaneous conversions, the desktop CPU was so much faster that it was actually able to keep up despite having half the cores.
The best way to choose a CPU is to check the benchmark data for the specific CPUs you’re considering. This will show which ones offer the best performance and how they might scale in relation to each other.
The available RAM is unlikely to have a significant impact on performance. How much RAM you need has more to do with the number and size of documents being converted. For example, a very large file requires more RAM to open and convert than a small text file needs. So, if you have enough RAM to handle the conversions, the speed of the RAM should not have a significant impact on the overall performance. On the other hand, insufficient RAM may prevent the conversion from being completed.
In our testing, we find that hard drive considerations are very similar to RAM considerations. The speed at which we write out the files is approximately the same as the speed of a standard 7200 rpm platter drive, so increasing the performance of the hard drive does not noticeably increase the conversion rate. The most important point here is that you have enough hard drive space to store your files once they are complete.
Are there any limits we should be concerned about?
JS: While ActivePDF does not impose any artificial limits, there are some general limits related to the environment or specifications. The bottom line here is that ActivePDF software will process documents as fast as your hardware will allow.
Here’s a good example: in a 32-bit environment you would have a limit of 8,388,607 pages, as each page consumes at least one indirect object and the number of indirect objects is capped at that value. The overall file size of a PDF must be below 10 GB (per the ISO 32000 PDF standard), as the cross-reference table that defines the PDF structure stores each byte offset in 10 digits.
In IIS there is a default maximum file size (30 MB for IIS 7). If you tried to send files larger than 30 MB through IIS, the requests would fail unless you first changed the IIS settings to allow larger files.
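As a quick check of where those two figures come from, the arithmetic can be worked through in a few lines of Python. This is only an illustration of the numbers quoted above; the constant names are made up for the example and are not part of any ActivePDF API.

```python
# The page limit quoted above is the largest value that fits in 23 bits.
MAX_INDIRECT_OBJECTS = 2**23 - 1

# A classic cross-reference entry stores a byte offset as 10 decimal digits,
# so the largest offset it can express is ten nines.
XREF_OFFSET_DIGITS = 10
MAX_XREF_OFFSET = 10**XREF_OFFSET_DIGITS - 1

print(f"{MAX_INDIRECT_OBJECTS:,}")                          # 8,388,607
print(f"{MAX_XREF_OFFSET:,} bytes")                         # 9,999,999,999 bytes
print(f"{10**XREF_OFFSET_DIGITS / 1e9:.0f} GB (decimal)")   # 10 GB (decimal)
```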
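One way to avoid surprises here is to flag oversized files before they are sent. The sketch below is a minimal Python example that assumes the IIS 7 default is 30,000,000 bytes; the function names are hypothetical, and the real request will be slightly larger than the file itself because of upload encoding overhead, so treat the result as a rough guide.

```python
import os

# Assumption: the IIS 7 default request size limit, expressed in bytes.
IIS7_DEFAULT_LIMIT_BYTES = 30_000_000


def exceeds_limit(size_bytes, limit_bytes=IIS7_DEFAULT_LIMIT_BYTES):
    """Return True when a payload of size_bytes is larger than the limit."""
    return size_bytes > limit_bytes


def oversized_files(paths, limit_bytes=IIS7_DEFAULT_LIMIT_BYTES):
    """Return the paths whose on-disk size is above the limit."""
    return [p for p in paths if exceeds_limit(os.path.getsize(p), limit_bytes)]
```

Calling `oversized_files` on a sample of your real documents shows whether the IIS setting needs to be raised before peak load arrives.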
How do I find out what kind of performance my server is capable of achieving?
JS: The best option is to create a simple test to determine the overall performance as conversions per hour. This allows you to test both your hardware and a realistic sampling of the actual files you intend to convert. Run the test with default settings, then make adjustments and run the test again to confirm any significant changes in performance.
Normally, the default settings are pretty good, but depending on exactly what your document load looks like, it may be worth making some adjustments. Since the hardware and documents have a significant impact on the overall performance, testing on your server with your documents is the best way to identify what type of load you will be able to handle.
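The simple test described above can be sketched as a small timing harness. This is a minimal Python sketch, not an ActivePDF tool: `convert` is a hypothetical callable that you would replace with whatever invokes your actual conversion, and `fake_convert` only simulates work so the example runs on its own.

```python
import time


def conversions_per_hour(convert, sample_files):
    """Run convert() once per file and extrapolate the rate to one hour."""
    files = list(sample_files)
    if not files:
        raise ValueError("need at least one sample file")
    start = time.perf_counter()
    for path in files:
        convert(path)
    # Guard against a zero reading if the work finishes almost instantly.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return len(files) / elapsed * 3600


def fake_convert(path):
    """Stand-in for a real conversion call (hypothetical); sleeps briefly."""
    time.sleep(0.01)


rate = conversions_per_hour(fake_convert, ["a.docx", "b.xlsx", "c.pptx"])
print(f"{rate:,.0f} conversions/hour")
```

Run it once with default settings, change one setting, and run it again on the same sample so the two numbers are directly comparable. A larger and more realistic sample gives a more trustworthy rate.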