
How to Optimize PDFs for SEO (7 Steps)

Google first started indexing PDFs in 2001. The format is commonly used in government, academia, and business environments.

PDFs are great for compatibility and consistency. They work on nearly any device and always maintain the same visual look. However, if you’re creating new content for the web, you should consider using web pages over PDFs.

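Below, we’ll explore how Google treats PDFs, why PDFs aren’t great for SEO, how to optimize a PDF, and how to track PDF views.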

How Google treats PDFs

PDFs show in Google search results with a PDF tag.

Google SEO starter guide PDF in search results.

PDFs are converted to and indexed as HTML. For PDFs that contain images of text, Google uses optical character recognition (OCR) to convert those images into text. Images in PDFs are also indexed in image search results.

Google chooses pages over PDFs if they’re duplicates. If you have pages and PDFs with the same content, Google tends to prefer the page version as the lead version of the duplicate cluster. This means that signals will be consolidated to the page version, and that will be the version that shows in search results.

Why PDFs aren’t great for SEO

Even though Google indexes and occasionally ranks PDFs, the format has a few disadvantages over web pages:

That said, I’m well aware that there are some situations where there’s no way around using a PDF for your content. If that’s the case for you, keep reading to learn how to optimize your PDFs for search.

How to optimize a PDF

Most on-page SEO elements that you know from HTML have an equivalent in PDFs and work the same way. Many are also there for accessibility reasons. So let’s discuss a few ways to optimize PDFs for SEO:

1. Write good content

Google’s company mission is to organize the world’s information. Even if it’s not a web page, good content is good content. I’ve seen lots of great content in PDFs like technical documentation, whitepapers, etc. Some of the best information on the web is buried in PDFs.

2. Add an optimized title

Just like web pages have title tags, PDFs have titles. Note that many search engines use the title to describe the document in their search results. If a PDF does not have a title, the filename appears in the SERP instead.

Here’s how to edit a PDF’s title in Adobe Acrobat Pro:

3. Add an optimized description

As with meta descriptions for web pages, this isn’t a ranking factor but gives you a shot at controlling the text that appears in search results.

4. Use a relevant file name

The filename of the PDF will be part of the URL. This will impact the URL shown in the search results and is a small ranking factor.
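If you generate a lot of PDFs, it can help to derive the file name from the document title rather than typing it by hand. Here's a minimal sketch of that idea. The function name `pdf_filename` and the six-word limit are my own assumptions for illustration, not a rule from Google.

```python
import re
import unicodedata


def pdf_filename(title: str, max_words: int = 6) -> str:
    """Build a lowercase, hyphen-separated PDF file name from a title."""
    # Strip accents so the name stays plain ASCII in the URL.
    ascii_title = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Keep runs of letters and digits; everything else acts as a separator.
    words = re.findall(r"[a-z0-9]+", ascii_title.lower())
    if not words:
        raise ValueError("title has no usable characters")
    # max_words is an arbitrary cap (assumption) to keep the URL short.
    return "-".join(words[:max_words]) + ".pdf"


print(pdf_filename("How to Optimize PDFs for SEO (7 Steps)"))
# prints: how-to-optimize-pdfs-for-seo.pdf
```

The result is easier to read in a URL than something like `Document_final_v3.pdf`, which is the main point here.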

5. Include image alt attributes

To help search engines understand the content of your images, you can add alt text to the images in your PDF.

6. Use headings

Just like heading tags (H1-H6) in web pages, you can specify that certain text in a PDF is a heading.

7. Include links

Just as with any page, internal and external links also impact rankings. Links pass PageRank and their anchor text adds context. By including links to your PDF and links from your PDF to other pages, you are helping PageRank flow through your site rather than creating a dead end. Some PDFs get a lot of links. Larry Page once said, “It turns out, people who win the Nobel Prize have citations from 10,000 different papers.”

Check out this GDPR document. It has 77K links from 823 referring domains to it but does not link out at all. This is a missed opportunity and adding some internal links from this PDF to other pages on the site might help those pages rank better.

Links to GDPR PDF.

This example from Google is better. Their SEO Starter Guide PDF has 3.37K links from 754 referring domains and they do a good job of passing that value to other pages by linking out from the PDF.

Links to Google SEO Starter Guide PDF.

Google does a good job of internal linking in the Google SEO Starter Guide PDF.

To add links in a PDF:

Sidenote. The screenshots and instructions above are for Acrobat Pro.

How to track PDF views

As we mentioned previously, PDFs are more difficult to track. Because of this, many marketing teams tend to gate PDFs or make them available only after a user fills out a form. By doing this, they shift the focus from tracking performance to lead generation. However, there are some options to track your PDFs including:

Event tracking

You can track clicks on PDF links and send them to your analytics system. This allows you to see how many times people clicked on the PDF files to download or open them. You can find out how to set these up here.
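Before wiring up click events, it helps to know which links on a page actually point at PDFs. The sketch below is one way to list them using only Python's standard library. The class name `PdfLinkFinder` and the sample HTML are made up for illustration, and the click tracking itself would still be set up in your analytics tool.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class PdfLinkFinder(HTMLParser):
    """Collect href values of <a> tags that point at .pdf files."""

    def __init__(self):
        super().__init__()
        self.pdf_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Check the path only, so query strings like ?v=2 don't hide a PDF.
        if href and urlsplit(href).path.lower().endswith(".pdf"):
            self.pdf_links.append(href)


# Hypothetical page markup, used only to demonstrate the parser.
page_html = """
<p><a href="/files/seo-guide.pdf">Download the guide</a></p>
<p><a href="/blog/">Blog</a></p>
<p><a href="https://example.com/report.PDF?v=2">Report</a></p>
"""

finder = PdfLinkFinder()
finder.feed(page_html)
print(finder.pdf_links)
# prints: ['/files/seo-guide.pdf', 'https://example.com/report.PDF?v=2']
```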

If you embed the PDF into a page using JavaScript or an iframe, you can just use the analytics data for the page itself.

Intermediate tracking script

This is a complex solution, but it’s possible to send PDF clicks through an intermediate tracking script that sends data to your analytics system before sending people to your PDF. You can find one example here.
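To make the idea more concrete, here's a rough sketch of what such a script could look like as a small WSGI app in Python. Everything in it is an assumption for illustration: the `file` query parameter, the `PDF_TARGETS` allow-list, and the `record_hit` placeholder, which you would replace with a call to your own analytics system.

```python
from urllib.parse import parse_qs

# Hypothetical allow-list mapping a short key to the real PDF URL.
# Redirecting only to known targets avoids creating an open redirect.
PDF_TARGETS = {
    "seo-guide": "/files/seo-guide.pdf",
}


def record_hit(key, environ):
    """Placeholder: send the click to your analytics system here."""
    referrer = environ.get("HTTP_REFERER", "-")
    print(f"pdf_click key={key} referrer={referrer}")


def track_and_redirect(environ, start_response):
    """WSGI app: record the click, then redirect the visitor to the PDF."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    key = query.get("file", [""])[0]
    target = PDF_TARGETS.get(key)
    if target is None:
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"Unknown file"]
    record_hit(key, environ)
    start_response("302 Found", [("Location", target)])
    return [b""]
```

Links on your pages would then point at the script (for example `/download?file=seo-guide`) instead of at the PDF itself. To try it locally, you could serve the app with `wsgiref.simple_server.make_server` from the standard library.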

Server logs

Because PDF files are stored on a server, any access requests for the files will be recorded in your log files.
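As a rough illustration, here's how you might count PDF requests in an access log with Python. This sketch assumes your server writes the common or combined log format used by Apache and Nginx by default, and the sample lines are invented. Keep in mind that the counts include crawlers and bots unless you filter them out.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Matches the request and status in a common/combined log format line,
# e.g. "GET /files/guide.pdf HTTP/1.1" 200
REQUEST_RE = re.compile(r'"GET (?P<target>\S+) HTTP/[^"]*" (?P<status>\d{3})')


def count_pdf_requests(log_lines):
    """Count successful (status 200) GET requests per PDF path."""
    counts = Counter()
    for line in log_lines:
        match = REQUEST_RE.search(line)
        # Only status 200 is counted; partial (206) responses are skipped
        # because PDF viewers may issue many range requests per view.
        if match is None or match["status"] != "200":
            continue
        # Drop the query string so /guide.pdf?ref=x counts as /guide.pdf.
        path = urlsplit(match["target"]).path
        if path.lower().endswith(".pdf"):
            counts[path] += 1
    return counts


# Invented sample lines, for demonstration only.
sample_log = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /files/seo-guide.pdf HTTP/1.1" 200 51234',
    '203.0.113.9 - - [10/Oct/2023:13:56:02 +0000] "GET /files/seo-guide.pdf?ref=email HTTP/1.1" 200 51234',
    '203.0.113.9 - - [10/Oct/2023:13:57:10 +0000] "GET /blog/ HTTP/1.1" 200 8120',
    '203.0.113.7 - - [10/Oct/2023:13:58:44 +0000] "GET /files/missing.pdf HTTP/1.1" 404 312',
]

print(count_pdf_requests(sample_log))
# prints: Counter({'/files/seo-guide.pdf': 2})
```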

3rd-party data

Because PDFs are rarely tracked in analytics systems, sometimes the best data you have is from another source like Google Search Console or Ahrefs. Ahrefs can also give you data on which of your competitors’ PDFs get the most organic traffic. Just paste their domain into Site Explorer, then go to the Top Pages report and search for URLs containing “.pdf”.

Final thoughts

Hopefully I’ve convinced you that in most cases you should create new content as web pages and not as PDFs. But what about old PDFs? Should you optimize them or convert them into pages? In typical SEO fashion, I’m going to go with “it depends.” I really don’t think there’s a right or wrong way to do this, so do what is easier for you. Either approach should have a positive impact, but depending on the effort and resources involved, the answer could be to optimize the PDFs, convert them into pages, or do something else instead.

Have questions? Let me know on Twitter.
