<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Felipe Barriga Richards &#187; PDF</title>
	<atom:link href="http://blog.felipebarriga.cl/tag/pdf/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.felipebarriga.cl</link>
	<description>Personal blog of Felipe Barriga Richards</description>
	<lastBuildDate>Thu, 29 Dec 2011 17:15:50 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Easy crop a PDF in Linux/Unix</title>
		<link>http://blog.felipebarriga.cl/linux/easy-crop-a-pdf-in-linux-unix/</link>
		<comments>http://blog.felipebarriga.cl/linux/easy-crop-a-pdf-in-linux-unix/#comments</comments>
		<pubDate>Wed, 12 May 2010 20:43:57 +0000</pubDate>
		<dc:creator>fbarriga</dc:creator>
				<category><![CDATA[Linux]]></category>
		<category><![CDATA[ImageMagick]]></category>
		<category><![CDATA[PDF]]></category>
		<category><![CDATA[pdftoppm]]></category>

		<guid isPermaLink="false">http://blog.felipebarriga.cl/?p=335</guid>
		<description><![CDATA[Sometimes somebody sends you a PDF with big white margins that only annoy you. This often happens when you receive PowerPoint presentations in PDF format, which makes it almost impossible to print several slides on one page. To avoid this you can crop the PDF pages and remove those margins. Getting the bounding box First [...]]]></description>
			<content:encoded><![CDATA[<p>Sometimes somebody sends you a <a href="http://es.wikipedia.org/wiki/Pdf">PDF</a> with big white margins that only annoy you. This often happens when you receive PowerPoint presentations in PDF format, which makes it almost impossible to print several slides on one page.<br />
<center><br />
<table>
<tr>
<td>
<div id="attachment_349" class="wp-caption aligncenter" style="width: 160px"><a href="http://blog.felipebarriga.cl/wp-content/uploads/2010/05/before_pdf.jpg" rel="lightbox[335]"><img src="http://blog.felipebarriga.cl/wp-content/uploads/2010/05/before_pdf-150x150.jpg" alt="PDF Before Conversion" title="Before" width="150" height="150" class="size-thumbnail wp-image-349" /></a><p class="wp-caption-text">Before</p></div>
</td>
<td>
<div id="attachment_350" class="wp-caption aligncenter" style="width: 160px"><a href="http://blog.felipebarriga.cl/wp-content/uploads/2010/05/after_pdf.jpg" rel="lightbox[335]"><img src="http://blog.felipebarriga.cl/wp-content/uploads/2010/05/after_pdf-150x150.jpg" alt="PDF After Conversion" title="After" width="150" height="150" class="size-thumbnail wp-image-350" /></a><p class="wp-caption-text">After</p></div>
</td>
</tr>
</table>
<p></center><br />
To avoid this you can crop the pdf pages and remove those margins.<br />
<span id="more-335"></span><br />
<br/></p>
<h1>Getting the bounding box</h1>
<p>First run <a href="http://poppler.freedesktop.org/">pdftoppm</a> to render the first slide as an image, then open it with <a href="http://www.gimp.org/">GIMP</a>.</p>

<div class="wp-terminal"><font color='#33FF00'>felipe@funstation</font> <font color='#0066FF'>$</font> pdftoppm -f 1 -l 1 -png clase_1_introduccion.pdf test<br/></div>


<div class="wp-terminal"><font color='#33FF00'>felipe@funstation</font> <font color='#0066FF'>$</font> gimp test-01.png<br/></div>

<p>Now put the cursor on the top-left corner of the area you want to keep. Look at the bottom of the GIMP window and write down the coordinates shown there. Do the same with the bottom-right corner. From those two points you get the (X,Y) coordinates of the top-left corner and, by subtraction, the width and height of the box.</p>
<p>In my case I get:<br />
<center><br />
<table>
<tr>
<td>
<div id="attachment_351" class="wp-caption aligncenter" style="width: 160px"><a href="http://blog.felipebarriga.cl/wp-content/uploads/2010/05/top-left.jpg" rel="lightbox[335]"><img src="http://blog.felipebarriga.cl/wp-content/uploads/2010/05/top-left-150x150.jpg" alt="Top Left Coordinates of PDF" title="Top Left" width="150" height="150" class="size-thumbnail wp-image-351" /></a><p class="wp-caption-text">Top Left</p></div>
</td>
<td>
<div id="attachment_352" class="wp-caption aligncenter" style="width: 160px"><a href="http://blog.felipebarriga.cl/wp-content/uploads/2010/05/bottom-right.jpg" rel="lightbox[335]"><img src="http://blog.felipebarriga.cl/wp-content/uploads/2010/05/bottom-right-150x150.jpg" alt="Bottom Right Coordinates of PDF" title="Bottom Right" width="150" height="150" class="size-thumbnail wp-image-352" /></a><p class="wp-caption-text">Bottom Right</p></div>
</td>
</tr>
</table>
<p></center><br />
<strong>&nbsp;&nbsp;&nbsp;top-left-X = 261<br />
&nbsp;&nbsp;&nbsp;top-left-Y = 198</strong><br />
&nbsp;&nbsp;&nbsp;bottom-right-X = 1014<br />
&nbsp;&nbsp;&nbsp;bottom-right-Y = 1449<br />
&nbsp;&nbsp;&nbsp;<strong>width</strong> &nbsp;= bottom-right-X &#8211; top-left-X = 1014 &#8211; 261 <strong>= 753</strong><br />
&nbsp;&nbsp;&nbsp;<strong>height</strong> = bottom-right-Y &#8211; top-left-Y = 1449 &#8211; 198 <strong>= 1251</strong><br />
<br/></p>
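<p>The subtraction above can be sketched as a small shell function. The name <code>crop_box</code> is hypothetical (it is not part of pdftoppm); it only prints the crop options in the form pdftoppm expects, assuming integer pixel coordinates as read from GIMP.</p>

```shell
# Hypothetical helper: turn two corner points into pdftoppm crop options.
# Arguments: top-left-X top-left-Y bottom-right-X bottom-right-Y (pixels).
crop_box() {
  # width  = bottom-right-X - top-left-X
  # height = bottom-right-Y - top-left-Y
  echo "-x $1 -y $2 -W $(( $3 - $1 )) -H $(( $4 - $2 ))"
}

crop_box 261 198 1014 1449
# prints: -x 261 -y 198 -W 753 -H 1251
```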
<h1>Cropping and saving the slides as images (png)</h1>

<div class="wp-terminal"><font color='#33FF00'>felipe@funstation</font> <font color='#0066FF'>$</font> pdftoppm -x 261 -W 753 -y 198 -H 1251 -png clase_1_introduccion.pdf clase1_temp<br/></div>

<p>If the images don&#8217;t look good (low resolution; check this in a viewer that does not apply anti-aliasing) you can increase the resolution from 150dpi (the pdftoppm default) to 600dpi.</p>
<p>The crop values are given in pixels, so they have to grow with the resolution. The ratio between 600 and 150 is 4:1, so you need to do some complex maths:<br />
&nbsp;&nbsp;&nbsp;&nbsp;<strong>new-top-left-X</strong> = top-left-X * 4 = 261 * 4 <strong>= 1044</strong><br />
&nbsp;&nbsp;&nbsp;&nbsp;<strong>new-top-left-Y</strong> = top-left-Y * 4 = 198 * 4 <strong>= 792</strong><br />
&nbsp;&nbsp;&nbsp;&nbsp;<strong>new-width&nbsp; </strong> = width &nbsp;* 4 = &nbsp; 753 * 4 <strong>= 3012</strong><br />
&nbsp;&nbsp;&nbsp;&nbsp;<strong>new-height</strong> = height * 4 = 1251 * 4 <strong>= 5004</strong></p>
<p>So the new command line is:</p>

<div class="wp-terminal"><font color='#33FF00'>felipe@funstation</font> <font color='#0066FF'>$</font> pdftoppm -r 600 -x 1044 -W 3012 -y 792 -H 5004 -png clase_1_introduccion.pdf clase1_temp<br/></div>
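
<p>For resolutions other than 600dpi the same scaling can be sketched in shell. The function name <code>scale_crop</code> is hypothetical, and it assumes the values were measured on an image rendered at the first dpi given. Shell arithmetic is integer-only, so a ratio that is not a whole number gets truncated.</p>

```shell
# Hypothetical helper: rescale crop values from one resolution to another.
# Arguments: old-dpi new-dpi value [value ...]
scale_crop() {
  old_dpi=$1
  new_dpi=$2
  shift 2
  scaled=""
  for v in "$@"; do
    # Multiply before dividing to keep as much integer precision as possible.
    scaled="$scaled $(( v * new_dpi / old_dpi ))"
  done
  # Strip the leading space left over from the loop.
  echo "${scaled# }"
}

scale_crop 150 600 261 198 753 1251
# prints: 1044 792 3012 5004
```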

<p><br/></p>
<h1>Merging the images into a PDF</h1>
<p>Using the convert command (<a href="http://www.imagemagick.org/">ImageMagick</a>) you can reassemble the images into a single PDF:</p>

<div class="wp-terminal"><font color='#33FF00'>felipe@funstation</font> <font color='#0066FF'>$</font> convert clase1_temp*.png clase_1_introduccion_cropped.pdf<br/></div>

<p><br/></p>
<h1>Deleting temporary files</h1>

<div class="wp-terminal"><font color='#33FF00'>felipe@funstation</font> <font color='#0066FF'>$</font> rm -f test-01.png clase1_temp*.png<br/></div>

<p><br/></p>
<h1>Notes</h1>
<ul>
<li>Using 600dpi instead of 150dpi makes the temporary files somewhat larger, and the output PDF can grow from 3MB to 14MB. The big problem is that convert is going to <strong>eat huge amounts of RAM</strong> (<u>can be several GB</u>), which can make your computer unstable. Processing time in pdftoppm also increases significantly. I recommend using only 300dpi.</li>
<li>We use the -<a href="http://es.wikipedia.org/wiki/Portable_Network_Graphics">png</a> flag so that the output files are not written in <a href="http://en.wikipedia.org/wiki/Netpbm_format#PPM_example">ppm</a> format. This saves disk space for both the temporary images and the final PDF (<strong>it really does generate a smaller PDF</strong>).</li>
</ul>
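<p>Putting the steps together at the 300dpi recommended above could look like the sketch below. The function name <code>crop_pdf_plan</code> and the temporary prefix <code>crop_tmp</code> are hypothetical, and the crop values are assumed to have been measured at the 150dpi default. The function only prints the commands so they can be reviewed first, and it does not handle file names containing spaces.</p>

```shell
# Hypothetical wrapper: print the commands for the whole workflow.
# Arguments: input.pdf dpi x y width height (crop values measured at 150dpi)
crop_pdf_plan() {
  pdf=$1; dpi=$2; x=$3; y=$4; w=$5; h=$6
  base=150                        # pdftoppm default resolution
  tmp=crop_tmp                    # assumed prefix for the temporary images
  out="${pdf%.pdf}_cropped.pdf"   # input name with _cropped before the extension
  # Scale every crop value by dpi/base, as in the 600dpi example.
  echo "pdftoppm -r $dpi -x $(( x * dpi / base )) -W $(( w * dpi / base )) -y $(( y * dpi / base )) -H $(( h * dpi / base )) -png $pdf $tmp"
  echo "convert $tmp*.png $out"
  echo "rm -f $tmp*.png"
}

crop_pdf_plan clase_1_introduccion.pdf 300 261 198 753 1251
# prints:
# pdftoppm -r 300 -x 522 -W 1506 -y 396 -H 2502 -png clase_1_introduccion.pdf crop_tmp
# convert crop_tmp*.png clase_1_introduccion_cropped.pdf
# rm -f crop_tmp*.png
```

<p>Once the printed commands look right, the same call can be piped to <code>sh</code> to run them.</p>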
<p><br/></p>
<h1>Example Files</h1>
<ul>
<li><a href='http://blog.felipebarriga.cl/wp-content/uploads/2010/05/clase_1_introduccion.pdf'>clase_1_introduccion.pdf (1.7 MB)</a></li>
<li><a href='http://blog.felipebarriga.cl/wp-content/uploads/2010/05/clase_1_introduccion_cropped.pdf'>clase_1_introduccion_cropped.pdf (8 MB)</a></li>
</ul>
<p><br/></p>
<h1>TODO</h1>
<ul>
<li>Convert the png files to jpg before building the PDF and compare the change in quality against the change in size</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://blog.felipebarriga.cl/linux/easy-crop-a-pdf-in-linux-unix/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>

