"Mydomain.com/blog/robots.txt" is useless. Google will not attempt to read it and does not have to pay attention to it. Some CMSs insist on adding it, but this is not part of the official robots.txt file definition: the file is only valid at the root of the host.
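To make the root-only rule concrete, here is a minimal Python sketch (the domain is just a placeholder) of how a crawler derives the one robots.txt location it will check from any page URL: always scheme plus host plus /robots.txt, never a subdirectory.

```python
from urllib.parse import urlsplit, urlunsplit

def robots_txt_url(page_url: str) -> str:
    """Return the only robots.txt location a crawler checks for a page:
    the root of scheme + host, never a subfolder like /blog/."""
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))

# A page deep inside /blog/ still maps to the root file;
# mydomain.com/blog/robots.txt is simply never requested.
print(robots_txt_url("https://mydomain.com/blog/some-post/"))
# -> https://mydomain.com/robots.txt
```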

File type and size can affect whether your robots.txt file is read

Ideally, a robots.txt file should be encoded in UTF-8 to avoid reading problems. But the truth is that text files can have several encodings: for example, if you create the file in Windows Notepad, it is likely that its format will be different. It is advisable to use a more professional plain-text editor (a simple and powerful option is Notepad++) where, among other things, you are allowed to choose the file encoding.
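If you generate the file with a script rather than an editor, you can sidestep the editor's defaults entirely. A minimal sketch, assuming you build the rules in Python (the file name and rules are just examples):

```python
# Write robots.txt with an explicit UTF-8 encoding instead of relying
# on the editor's or OS's default (older Windows Notepad versions saved
# in a legacy ANSI code page unless told otherwise).
rules = "User-agent: *\nDisallow: /private/\n"

with open("robots.txt", "w", encoding="utf-8", newline="\n") as f:
    f.write(rules)
```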

Even so, Google tells us that it can read other encodings

But what happens in these cases is not so much that Google can or cannot read a given encoding, but that the file is written in one encoding when it is generated and the server returns it in another. This can cause the typical strange characters, which end up making the file not work or not be read properly.
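One quick way to catch this kind of mismatch is to fetch the live file and compare the charset the server declares with what the bytes actually decode as. A small sketch, with mydomain.com as a placeholder:

```python
import urllib.request

with urllib.request.urlopen("https://mydomain.com/robots.txt") as resp:
    declared = resp.headers.get_content_charset()  # charset in Content-Type
    body = resp.read()

print("Server declares:", declared)
try:
    body.decode(declared or "utf-8")
    print("Bytes decode cleanly with the declared charset")
except UnicodeDecodeError:
    print("Mismatch: the bytes are not valid in the declared charset")
```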

Even within UTF-8 files there is something called a BOM

The BOM (Byte Order Mark) is a marker that occupies the very beginning of the file. Ideally, simple text files should not have a BOM, but Google is able to read a robots.txt with an initial BOM (although only one, and only at the very beginning), so if your file has one, nothing happens.
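If you want to check your own file, the BOM is easy to detect: it is the fixed byte sequence EF BB BF at the start. A minimal sketch in Python, where utf-8-sig is the codec that strips a single leading BOM, mirroring the tolerance described above:

```python
import codecs

with open("robots.txt", "rb") as f:
    raw = f.read()

# The UTF-8 BOM is the fixed prefix b'\xef\xbb\xbf'.
print("BOM present:", raw.startswith(codecs.BOM_UTF8))

# Decoding with utf-8-sig removes one leading BOM if there is one,
# which matches how Google tolerates it.
text = raw.decode("utf-8-sig")
```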

Another limitation is the size

Google limits us to 500 KB (500 KiB, to be precise) and anything beyond that limit will not be read. So we should keep these files lean, not only to stay far below 500 KB, but because they are files that robots consult frequently, and a larger size means more data processing and network load on the server.
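A quick check against that limit is straightforward; a sketch, assuming the file sits in the working directory:

```python
import os

LIMIT = 500 * 1024  # Google's documented cap: 500 KiB

size = os.path.getsize("robots.txt")
print(f"robots.txt is {size:,} bytes ({size / LIMIT:.1%} of the limit)")
if size > LIMIT:
    print("Warning: rules past the 500 KiB mark will be ignored")
```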
