LBD #034: Crawl Budget: A Ghost or an Actual Metric?

Reading time: 4 Minutes

TL;DR & Summary

Many website owners and SEO professionals overlook crawl budget, even though it can affect their website’s visibility on search engines.

Crawl budget is one of the most misunderstood concepts in SEO. Instead of confusing you further, I’ll make the concept easier to understand.

From today onwards, I’d like you to think of crawl budget as crawl rate: how frequently the crawler is able to crawl your pages. That frequency depends on how easy you make things for the crawler.

E.g., having more money determines how easily you can buy things, even the most expensive ones.

By the end of this issue, you will be able to:

  1. Understand and optimize your website’s crawl budget to improve search engine visibility
  2. Identify obstacles that may affect your crawl budget and how to overcome them
  3. Implement strategies to better control your crawl budget & improve your website’s search engine performance

Many businesses and SEO professionals overlook crawl budget, even though neglecting it can hurt their website’s search engine visibility.

Without understanding crawl budget, your website’s content may not be crawled effectively, let alone indexed. That leads to reduced search engine visibility and, in turn, less organic traffic.

Crawl budget is the number of pages on your website that search engine crawlers will crawl and index within a given timeframe. Notice the wording here: crawl budget is about ‘time’.

The easier you make it for crawlers, the more resources they can crawl. Simple!
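To make the ‘time’ idea concrete, here is a toy calculation in Python. It is only an illustration, not how any search engine actually schedules crawling: it assumes the crawler fetches pages one after another inside a fixed time window, and the function name and the numbers are made up for the example.

```python
def pages_crawled(time_budget_s: float, avg_response_s: float) -> int:
    """Toy model: pages fetched one after another within a fixed time window."""
    if avg_response_s <= 0:
        raise ValueError("avg_response_s must be positive")
    return int(time_budget_s // avg_response_s)


# Same 60-second window, two hypothetical sites.
slow_site = pages_crawled(60, 2.0)  # each page takes 2 seconds to respond
fast_site = pages_crawled(60, 0.5)  # each page takes half a second
print(slow_site, fast_site)
```

In this simplified picture, the faster site gets four times as many pages crawled in the same window, which is the whole point of thinking about crawl budget in terms of time.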

Here are some misconceptions that people have about crawl budget:

  1. Thinking crawl budget is only about the number of pages: Many people assume that crawl budget is simply the number of pages a search engine bot will crawl on their website. However, crawl budget is also affected by the time it takes for a page to load, the depth of the page within the site structure, and the crawl rate limits set by the website owner.
  2. Believing that a large number of URLs automatically means a large crawl budget: While having too many URLs can lead to slow crawling, this is not always the case. The quality of the URLs is just as important as the quantity. If the URLs are relevant and of high quality, the search engine bot will likely crawl them more frequently.
  3. Blaming every crawling or indexing issue on crawl budget: When your pages aren’t getting indexed on search engines, the first problem most people think of is crawl budget. But in 9 out of 10 cases, that’s not the problem, unless you run an e-commerce or news site with tons and tons of pages.

Notes:

  1. Many website owners make the mistake of assuming that their crawl rate remains constant. However, crawl budget can change over time due to changes in website structure, server errors, and other factors. Regular monitoring of crawl budget is crucial to ensure that search engine bots can crawl and index the website efficiently. Here’s how you can check the crawl stats in Google Search Console.
  2. Crawl budget is a hypothetical metric that doesn’t actually exist, at least not as a number you can look up. How easy or difficult it is to crawl your site’s resources defines the crawl budget, and there’s no figure that marks a good or bad one. Still, since making it hard for the web crawler to access resources hurts your site’s success (and, technically, its crawl budget), it’s important to make sure your site is ready for a healthy crawl.
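If you also have access to your server’s access logs, a rough way to watch crawl activity over time is to count the requests whose user agent mentions Googlebot, day by day. The Python sketch below assumes logs in the common “combined” format; the sample lines, IP addresses, and the function name are made up for illustration. Keep in mind that a user agent string can be spoofed, so treat the counts as an approximation.

```python
import re
from collections import Counter

# Matches the date part of a combined-log timestamp, e.g. [12/Apr/2023:10:15:32 +0000]
DATE_RE = re.compile(r"\[(\d{2}/[A-Za-z]{3}/\d{4}):")


def googlebot_hits_per_day(log_lines):
    """Count log lines per day whose text mentions Googlebot (user agent match only)."""
    hits = Counter()
    for line in log_lines:
        if "Googlebot" not in line:
            continue
        match = DATE_RE.search(line)
        if match:
            hits[match.group(1)] += 1
    return dict(hits)


# Made-up sample lines; in practice you would read them from your own log file.
sample_log = [
    '66.249.66.1 - - [12/Apr/2023:10:15:32 +0000] "GET /blog/ HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.7 - - [12/Apr/2023:10:16:01 +0000] "GET /blog/ HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0 (Windows NT 10.0)"',
    '66.249.66.1 - - [13/Apr/2023:08:02:11 +0000] "GET /pricing/ HTTP/1.1" 200 2048 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]
print(googlebot_hits_per_day(sample_log))
```

A sudden drop in daily hits after a site change or a spell of server errors is the kind of signal worth cross-checking against the crawl stats in Google Search Console.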

Here are 3 simple steps to ensure you don’t run into crawling issues:

Step 1: Invest in good web hosting

Since crawl budget is a product of how frequently (read: easily) a web crawler is able to crawl the resources on your website, it’s very important to have an environment that can serve visitors and crawlers simultaneously.

If you opt for cheap web hosting services, you’re likely to shoo away the crawler, because crawlers such as Googlebot gauge how well the server responds and slow down when it responds slowly or returns errors.

I recommend using Cloudways or Bluehost to host your website. Both web hosting providers are known for top-notch service and have a proven track record of welcoming visitors and crawlers with open arms.

Step 2: Use the HTTP/2 protocol

The HTTP/2 protocol was designed to combat a very specific problem. Skipping the technical details, it simply means that the latency of server requests is reduced by a large chunk.

Instead of opening a separate connection for every resource, the crawler can request many resources at once over a single connection. It’s a bit like streaming Netflix on your device instead of downloading each episode one at a time.

The biggest benefit of HTTP/2 is that crawling puts less strain on the server, so it has a much easier time handling the load of both visitors and crawlers.

If you don’t already know, unlike visitors, the crawler can send several requests per second in order to fetch updates from your site. If you run a large e-commerce or news site, you and I can imagine how huge the list of updates can be.
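If you’re not sure whether your server already speaks HTTP/2, one way to check is to see which protocol it picks during the TLS handshake (via ALPN). The sketch below uses only Python’s standard library; the function names are my own and `example.com` in the usage comment is a placeholder for your own domain.

```python
import socket
import ssl


def negotiated_protocol(host: str, port: int = 443, timeout: float = 5.0):
    """Return the ALPN protocol the server selects ('h2', 'http/1.1') or None."""
    context = ssl.create_default_context()
    # Offer HTTP/2 first, with HTTP/1.1 as the fallback.
    context.set_alpn_protocols(["h2", "http/1.1"])
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.selected_alpn_protocol()


def supports_http2(protocol) -> bool:
    """True when the negotiated ALPN protocol is HTTP/2 over TLS ('h2')."""
    return protocol == "h2"


# Usage (needs network access), with a placeholder domain:
#   print(supports_http2(negotiated_protocol("example.com")))
```

If the check comes back negative, your hosting provider or CDN settings are usually the place to look for an option to enable HTTP/2.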

Step 3: Keep sitemaps up to date

Sitemaps are one of the main entry points for crawlers. No matter how old the site is, the crawler relies on the sitemap to discover pages and learn what has changed. So it’s highly advisable to keep the sitemap up to date. If you use WordPress as the CMS for your website, you can use plugins like RankMath to keep your sitemap up to date.

Every time you update a post or publish a new one, the sitemap keeps a record of it, and when the crawler arrives, it can jump directly to those specific pages to request the updates. These updates are then added to the search index, which is later used to serve results to end users.
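If you don’t use a plugin, the record itself is simple to produce. Here is a minimal Python sketch that builds a sitemap with a `lastmod` date for each page, following the sitemaps.org format. The URLs, dates, and function name are placeholders for illustration, and a real setup would pull the list of pages from your CMS.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from (url, last_modified) tuples; dates as YYYY-MM-DD."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, last_modified in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        ET.SubElement(entry, "lastmod").text = last_modified
    return ET.tostring(urlset, encoding="unicode")


# Placeholder pages; replace with your own URLs and last-updated dates.
xml_text = build_sitemap([
    ("https://example.com/", "2023-04-20"),
    ("https://example.com/blog/crawl-budget/", "2023-04-24"),
])
print(xml_text)
```

The `lastmod` value is what tells the crawler which pages changed since its last visit, so it only helps if it is updated whenever the page actually changes.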


SEO this week (News Updates)

  1. Google published: “How we fought spam on Google search in 2022”
  2. Project Magi: Google search’s new search engine to defend the market share
  3. Private domains are a ranking factor for Bing search?

Clickworthy Resources

  1. How to find your local business listings on Bard? (& how to optimize?)
  2. How useful is Natural Language Processing (NLP) for SEO?

Tools that you should know about

  1. Image search but with AI? Try MiniGPT4
  2. God Mode: What ChatGPT should’ve been

Surf the AI Wave

  1. Nvidia joins the text generation ride. Launches text-to-video tool that can create movies.

That’s a wrap. Until next time 🙂


If you’re looking forward to winning online, here’s how I can help:

  1. Sit with you 1-on-1 & create a content marketing strategy for your startup. Hire me for paid consulting.
  2. Write blogs, social posts, and emails for you. Get in touch here with queries (Please mention you found this email in the newsletter to get noticed quickly)
  3. Join my tribe on Twitter where I share SEO tips (every single day) & teaser of the next issue of Letters ByDavey.