What is Crawling for SEO?
What is Crawling?
An important aspect of Search Engine Optimization (SEO) is understanding crawling vs indexing. Crawling is when Google or another search engine sends a bot to a web page or post to “read” it. This is how Googlebot and other crawlers ascertain what is on the page. Don’t confuse this with having the page indexed. Crawling is the first step toward having a search engine recognize your page and show it in search results. Having your page crawled, however, does not necessarily mean your page was (or will be) indexed. To be found in a query on any search engine, your page must first be crawled and then indexed.
When a page is created or updated, how does Google know to examine it?
Pages are crawled for a variety of reasons including:
- Having an XML sitemap with the URL in question submitted to Google
- Having internal links pointing to the page
- Having external links pointing to the page
- Getting a spike in traffic to the page
To help ensure that your page gets crawled, you should submit an XML sitemap in Google Search Console (formerly Google Webmaster Tools) to give Google a roadmap of all of your new content.
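If your site platform doesn't generate a sitemap for you, the file itself is simple enough to build by hand. Here is a minimal sketch in Python that writes the basic sitemap format using only the standard library. The example.com URLs and the build_sitemap helper are made up for illustration; swap in your own pages.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML for an iterable of (url, last_modified) pairs.

    last_modified is a YYYY-MM-DD string, or None to leave it out.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        if lastmod:
            ET.SubElement(url, "lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical pages, for illustration only
pages = [
    ("https://www.example.com/", "2024-01-15"),
    ("https://www.example.com/blog/what-is-crawling/", None),
]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

Save the output as sitemap.xml on your site and submit its URL in Google Search Console.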
What Does Crawl Mean?
Getting crawled means that Google is looking at the page. Depending on whether Google thinks the content is “new” or otherwise has something to “give to the Internet,” it may schedule the page to be indexed, which means it has the possibility of ranking.
Also, when Google crawls a page, it looks at the links on that page and schedules Googlebot to check out those pages too. The exception is when a nofollow attribute (rel="nofollow") has been added to the link, which asks Google not to follow it.
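To make the idea concrete, here is a rough sketch of how a crawler might pull the links out of a page and set aside the ones marked nofollow. This is not how Googlebot is actually implemented; the LinkCollector class, the sample HTML, and the example.com URLs are all assumptions for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect links from a page, separating out rel="nofollow" links."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.to_crawl = []  # links a crawler would queue up
        self.skipped = []   # links marked nofollow

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href")
        if not href:
            return
        # Resolve relative links against the page's own URL
        url = urljoin(self.base_url, href)
        # rel can hold several space-separated values
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.skipped.append(url)
        else:
            self.to_crawl.append(url)


# Hypothetical page content
html = """
<p><a href="/blog/">Blog</a>
<a href="https://partner.example.org/" rel="nofollow sponsored">Partner</a>
<a href="contact.html">Contact</a></p>
"""

collector = LinkCollector("https://www.example.com/about/")
collector.feed(html)
print("Queue for crawling:", collector.to_crawl)
print("Skipped (nofollow):", collector.skipped)
```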
What is the Difference Between Crawling and Indexing?
Many terms are continually thrown around in the SEO world, many of which seem to be synonymous. Crawling and indexing are a perfect example of two words that are often used interchangeably. Whether or not the writer understands the difference in meaning, many SEO articles lead readers to believe the two words mean the same thing, but they most definitely do not.
So, exactly what is the difference between crawling and indexing?
Before we get into the difference between crawling and indexing, we must first explain what it means to have your site/page indexed.
Having your page crawled in no way means that it has been indexed, or that it can be found in a Google search.
What Does Being Indexed Mean?
Being indexed by Google means your page can show up in search results. The best explanation of crawling vs indexing is that Google indexes a page AFTER it crawls it (if it deems it worthy). Not every page that gets crawled by search engine bots gets indexed, but every indexed page had to be crawled first. If Google deems your new page worthy of being found, then Google will index it. Once a page is indexed, Google works out how it should be found in search.
At this point, Google decides which keywords your web pages will appear for and where they will rank in each keyword search. This is determined by a variety of factors that ultimately make up the entire business of SEO. Also, any links on the indexed page are now scheduled for crawling by Googlebot and other search engine crawlers.
It’s not only those links that get crawled; it is said that Googlebot will follow links up to five pages back. That means if a page linked to a page, which linked to a page, which linked to a page, which linked to your page (which just got indexed), then all of them will be crawled.
This process is the basis of why external links that come to your site are so important. The higher the quality of the page that ultimately links to you, the better you will rank in the all-powerful Google Search Engine Index.
This is what many SEO companies charge big money for—creating (or allowing the creation of) many links that will come to your site from high-quality web sites using keywords you want to be found by. By no means is this the only thing that an SEO agency might do, but it’s almost guaranteed to be on the list.
How Can I Tell What Google has Indexed?
Although you NEED your site to be crawled, you WANT it to get indexed. There are several ways to determine what Google has indexed on your site.
One is to simply go to Google.com, click on Settings at the bottom right, and choose Advanced Search. From there, scroll down to “site or domain,” enter your website, and hit Search. This will show you what Google has indexed. It should include pages, posts, and photos, and possibly other items such as feeds.
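The “site or domain” field in Advanced Search does the same job as typing the site: operator straight into the search box. As a small sketch, this Python snippet builds that search URL for a domain. The site_search_url helper and example.com are placeholders for illustration, and the results are best treated as a rough sample rather than a complete list of indexed pages.

```python
from urllib.parse import urlencode


def site_search_url(domain):
    """Build a Google search URL restricted to one domain via the site: operator."""
    return "https://www.google.com/search?" + urlencode({"q": f"site:{domain}"})


# Hypothetical domain; open the printed URL in a browser
print(site_search_url("example.com"))
```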
The preferred way to see exactly what Google has indexed (because you have some control over fixing it) is to use Google Search Console (previously named Google Webmaster Tools). We aren’t covering how to set up Google Search Console in this article, but if you have a website, it NEEDS to be done.
Google Search Console lets you upload an XML Sitemap, which lets you tell Google what you would LIKE for them to index and how often they should check back for changes. Google Search Console also provides a ton of valuable information on your website and is really the only two-way communication with Google that exists.
It is also a good idea to run a quick, free SEO report on your website. The best automated SEO audits will provide information on your robots.txt file, a very important file that tells search engines and other crawlers whether they CAN crawl your website. Although some of the free SEO reports you will find across the web may be nothing more than a lead generation tool, One Click SEO offers (what we consider to be) the Best SEO Audit Tool with the promise that no one will harass you.
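You can also check robots.txt rules yourself. The sketch below uses Python's built-in urllib.robotparser module to test whether a given crawler may fetch a URL. The robots.txt content, the bot names, and the example.com URLs are made up for illustration; Python's parser also may not match Google's own robots.txt handling in every edge case.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only
robots_txt = """\
User-agent: *
Disallow: /private/

User-agent: Googlebot
Disallow: /drafts/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())


def can_crawl(user_agent, url):
    """Return True if the parsed rules allow user_agent to fetch url."""
    return parser.can_fetch(user_agent, url)


for agent, url in [
    ("Googlebot", "https://www.example.com/blog/"),
    ("Googlebot", "https://www.example.com/drafts/post"),
    ("OtherBot", "https://www.example.com/private/x"),
]:
    print(agent, url, "allowed" if can_crawl(agent, url) else "blocked")
```

To test a live site instead of a string, the same module offers set_url() and read() to download the file first.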
How Does Google Decide What to Index?
This is the real question everyone should be asking. At the end of the day, Google will index new, fresh content that Google believes will improve the user experience of THEIR clients—the people who go to Google and search for something. They are very picky about trying to provide the most relevant websites for a specific search term. If you’re copying pages or are using copy that’s otherwise already in their index, then there’s no need to index yours.
You may have heard the term “Duplicate Content” thrown around in SEO articles. Duplicate content is a point of contention for many SEO gurus, but I say that at best, it confuses Google on which page to rank, and at worst, you get penalized. At the end of the day, stay away from duplicate content. But I digress.
If what you wrote is BETTER or provides more information or if Google otherwise believes that showing your page as opposed to the other pages will give their clients a better experience, they will index and rank your site. This is why providing fresh, new SEO-rich blog content is so important. The more quality pages indexed with internal links to other pages within your site, the better for SEO.
YAY! Now I Understand SEO!
Not quite! We are just scratching the surface of what Googlebot likes and how to effectively leverage SEO. Depending on your type of business, there are different ways to have your company found in a Google search.
For instance, if you are a bricks-and-mortar business with a storefront, you’ll want to focus on Local SEO. Local SEO focuses on searches that Google has determined are looking for a local business. These types of searches show a map pack of Google Business Profiles at the top. For instance, if you wanted to find an SEO service in New Orleans, you’d Google “New Orleans SEO.” That type of search will provide you with local results for search engine optimization companies within a map pack. If you’re a dry cleaner, you know this type of searching is important to you, but if you provide online training, then your geographical location isn’t as important.
Each type of business benefits from specific SEO strategies. As an example, if you are a real estate broker or a Realtor, real estate SEO requires optimizing your Google Business Profile (formerly Google My Business) to show up in local searches, augmented with content marketing.