So you have written a web article, but Google does not notice. Here are the things you need to check.
You might want to follow all these steps if you suspect that several pages are affected. You can, however, simply check an individual page as described below.
Make sure that your article is publicly visible
First of all, let us rule out that you published the article incorrectly by mistake. Maybe you published it in such a way that only users who are logged into your site can read it.
This is how to check:
Open your article in an anonymous window. In Chrome it is called an “incognito window”; you can open one with the shortcut CTRL+Shift+N or via a right mouse click. In Firefox it is a “private window”, which you can open with CTRL+Shift+P.
If this was the cause of the problem, then make that article readable and inform Google (see below).
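The same check can be scripted. The following is only a minimal sketch using Python's standard library: it requests the page without any cookies, much like an anonymous window does. The function names and the article URL are made up for illustration. Note that some sites answer with a normal status 200 but show a login form, so a look at the page itself is still the safer test.

```python
import urllib.error
import urllib.request


def describe_status(status):
    """Translate an HTTP status code into a rough visibility verdict."""
    if 200 <= status < 300:
        return "public"
    if status in (401, 403):
        return "login or permission required"
    if status == 404:
        return "not found"
    return f"unexpected status {status}"


def check_visibility(url):
    """Fetch the URL without cookies, like an anonymous visitor would."""
    request = urllib.request.Request(url, headers={"User-Agent": "visibility-check"})
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return describe_status(response.status)
    except urllib.error.HTTPError as error:
        # urlopen raises HTTPError for 4xx and 5xx answers.
        return describe_status(error.code)
```

A call like check_visibility("https://yoursite.com/your-article/") should return "public" for an article that everyone can read.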
Check whether or not Google knows your article
Open a new browser tab and enter “site:yoursite.com” into the Google search box to see which pages of your website Google knows. For example: site:heikoevermann.com shows all articles on my developer blog.
If you get nothing here, then you indeed have a problem with your site that you need to solve.
Note: you can even check a single page in the same way: site:yoursite.com/fullpath.
Try this: site:heikoevermann.com/the-singleton-design-pattern-in-abap-objects This is how Google shows an article that it knows.
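If you check pages often, you can build the search link for such a site: query with a few lines of Python. This is just a convenience sketch; the function name is made up, and it only builds the link that you then open in your browser.

```python
from urllib.parse import quote_plus, urlsplit


def site_query_url(page_or_site):
    """Build a Google search URL for a `site:` query."""
    # Accept both "yoursite.com/path" and a full URL with scheme.
    with_scheme = page_or_site if "://" in page_or_site else "//" + page_or_site
    parts = urlsplit(with_scheme)
    # The site: operator expects host plus path, without the scheme.
    target = (parts.netloc + parts.path).rstrip("/")
    return "https://www.google.com/search?q=" + quote_plus("site:" + target)


print(site_query_url("heikoevermann.com"))
```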
If Google does not know the page, you have to investigate. This might be perfectly normal if your site is brand new. I once registered a new site in the Google Search Console and it took some days for all articles to appear. At first site:mydomain.com did not list anything. Then it was one article, a day later it was two. Pages continued to trickle in until finally all pages were there. You should find your first URL within a week or so. If not, you might want to continue with the checks below.
Check your sitemap
So Google does not know your article or maybe even your website. The next thing to check is your sitemap. A sitemap is a technical page in XML format. It is built to be machine readable and it tells the Googlebot which pages on your website you want to see indexed.
Theoretically you don’t need a sitemap. Google explains that in detail in this article. But they also state that a sitemap never does harm, and in most cases it will help, especially:
- If your site is large.
- If your site is new and few people are linking to it so far.
- If your articles are not well linked to one another. That is a topic of its own. You should improve your site by interlinking your articles. But a sitemap should be enough to make your articles visible.
Fortunately a sitemap is easy to generate, at least if your site runs on WordPress. Since WordPress 5.5, sitemaps are a built-in feature of WordPress. Before that, you needed plugins like Yoast or RankMath to get that job done.
The default sitemap can be found at yourdomain.com/wp-sitemap.xml. But an SEO plugin like RankMath might redirect that to yourdomain.com/sitemap_index.xml.
Often the sitemap is also listed in yourdomain.com/robots.txt like this one:
The robots.txt file is a hint to the Googlebot (and other bots) which roads on your website it is allowed to take and which not.
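To see how a bot reads such a file, you can feed a robots.txt to the parser from Python's standard library. The file content below is made up for illustration and not taken from a real site. Python's parser also does not evaluate rules in exactly the same way as the Googlebot, so treat the result as a rough check only.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, made up for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/

Sitemap: https://yourdomain.com/sitemap_index.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Sitemaps declared in the file (site_maps() needs Python 3.8 or newer).
sitemaps = parser.site_maps()

# Which roads is a bot allowed to take?
article_allowed = parser.can_fetch("Googlebot", "https://yourdomain.com/my-article/")
admin_allowed = parser.can_fetch("Googlebot", "https://yourdomain.com/wp-admin/options.php")

print(sitemaps, article_allowed, admin_allowed)
```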
This is the link to one of my websites: https://lern-platt.de/sitemap_index.xml. Yours probably looks similar.
You will notice that the sitemap is distributed over several XML files: one for WordPress posts, one for WordPress pages, maybe one for categories or media files.
Now let’s have a look at one of these in detail. You will see that this technical XML file is still quite readable.
Here you can see all the pages that your website wants Google to know. So if the page that you are missing is not listed here, this is the first thing to fix. If that is the cause of your trouble, check your cache plugin and try refreshing the cache.
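If your sitemap is long, a small script can tell you whether a page is listed. The sketch below parses sitemap XML with Python's standard library. The XML content and the function name are made up for illustration; a real sitemap would first have to be downloaded from your site.

```python
import xml.etree.ElementTree as ET

# XML namespace defined by the sitemap protocol (sitemaps.org).
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap content, made up for illustration.
SITEMAP_XML = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://yourdomain.com/first-article/</loc>
    <lastmod>2021-01-15T10:00:00+00:00</lastmod>
  </url>
  <url>
    <loc>https://yourdomain.com/second-article/</loc>
  </url>
</urlset>
"""


def listed_urls(xml_text):
    """Return every <loc> entry of a sitemap or sitemap index."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    return [loc.text.strip() for loc in root.findall(".//sm:loc", NS) if loc.text]


urls = listed_urls(SITEMAP_XML)
print("https://yourdomain.com/missing-article/" in urls)
```

Because a sitemap index also stores its entries in loc elements, the same function lists the sub-sitemaps when you feed it a file like sitemap_index.xml.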
Check your Google console
You certainly have already registered your website with the Google Search Console: https://search.google.com/search-console
So let us have a look at some of that data to check what is happening. In the GSC please click on “Sitemaps”. You can see the sitemap that you submitted to Google. There should only be one. You can see
- when you submitted your sitemap
- when it was last read. This should be a quite recent date, because Google periodically visits your website. If it is not, that is something to work out.
- how many URLs Google has found
The important thing to look out for is the status. It should show a green “success”. If you find an error here, then you need to investigate. This could happen if there is some kind of access filter on your website that keeps the Googlebot out, so that it cannot read your sitemap.
Please note that you can click on the sitemap_index.xml file, you can then find more details:
So by now you should have an overview of the number of articles that Google has found in your sitemap.
Check the coverage in the GSC
Please click on “coverage” in the GSC menu.
Please note that by clicking on the coloured fields you can switch these fields on or off.
Below this graphical overview you can find a list of how Google evaluated your web pages:
This list might contain quite a number of pages that you have not declared in your sitemap. The reason is simple: Google not only processes your sitemap. Google also tries to follow any link on your website.
It is perfectly normal that several URLs are “excluded”, e.g. because of a status 404. In that case Google once knew the page, but your server now reports that it no longer exists, for example because you deleted it.
Filter the GSC against your sitemap
There is an important filter in the GSC coverage view.
Here you can choose
- all known pages
- all submitted pages
- filter to sitemap
Options 2 and 3 are the same if you only have one sitemap. (BTW: you normally should only have one.)
Make sure that you have selected the gray area “excluded”. You can deselect the green area “valid”. Now you have filtered for those pages that
- you intended Google to accept, because you have registered them via your sitemap but
- that Google has discarded.
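The logic of this filter is a simple comparison of two lists. The sketch below assumes that you already have both lists as plain URLs, for example copied from your sitemap and from the GSC. The function names and the URLs are made up for illustration.

```python
def normalise(url):
    """Ignore surrounding whitespace and a trailing slash when comparing URLs."""
    return url.strip().rstrip("/")


def submitted_but_not_indexed(sitemap_urls, indexed_urls):
    """Return the sitemap URLs that are missing from the list of indexed URLs."""
    indexed = {normalise(url) for url in indexed_urls}
    return [url for url in sitemap_urls if normalise(url) not in indexed]


# Hypothetical example data.
sitemap_urls = [
    "https://yourdomain.com/first-article/",
    "https://yourdomain.com/second-article/",
]
indexed_urls = ["https://yourdomain.com/first-article"]

missing = submitted_but_not_indexed(sitemap_urls, indexed_urls)
print(missing)
```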
Check individual URLs for errors
As you are currently sorting out problems, you probably have just found some pages that Google did not like. So let’s start to sort out why.
Click on one of the URLs. A popup then appears on the right. There you can check whether your robots.txt allows Google to process that page. The problem should rarely be here.
Or you can click “inspect URL”. As mentioned above, there is also a direct way to get there:
Simply copy the page URL from your website and enter it in the search box in the header area of the GSC. You can do this when there is only one page that you would like to check.
Please just remember that this shortcut will not give you a complete overview. And when one page is missing, there might also be more that Google has not found.
Deal with the errors
This is an actual case. It is an article that I would have expected Google to rank.
Google tells me that it has discovered the file through a sitemap. The Googlebot also found a “referring page”, a page that linked to my missing page. (A page should have some other page that is referring to it, not just the sitemap.) But it has never actually crawled that page.
Now there are two things I can try:
- I can manually request Google to index that page or
- I can check the live page.
Let us test the live URL first. After clicking on that link I get this popup:
After a little while this is the result:
So basically Google is fine with my web page. Google just somehow forgot to process it. I have noticed that this happens every now and then. This was especially the case when Google diverted a lot of crawling capacity to the snippet analysis in early 2021.
In other cases you might find real errors that you have to deal with. I once had a page where Google stated that there was too little content on the page. It was a very short article indeed, just a joke with an image. Google did not find that funny and insisted that the page needed more content.
In another case Google stated that the page was not mobile friendly. For Google that was reason enough not to include the page in its index. Such mobile-friendliness problems seem to come and go. Basically my WordPress theme was fine and fortunately a recheck solved the problem.
It might be interesting to collect different cases. So if you have an interesting example, please send me a screenshot to add to this article.
Manual indexing
As there is nothing to fix, let’s click on “request indexing”.
After a short while this is what we get:
Google will now process the page. This might take a day or two. So much for today. Let us wait until Google has processed the page.
What happened next?
I submitted the page on March 14, 2020. It appears under “submitted and indexed” since March 17, 2020. So the fix did work.
As mentioned above: if your page still does not appear on Google then please send me a screenshot to make a more detailed list of difficult cases and how to solve them.