So maybe it's the status code? Shouldn't that page return a 200 OK?
When I go to blog.james..., I first get a 301 Moved Permanently, and then journal.james... loads, but it returns a 304 Not Modified, even if I then reload the page.
Only when I fully submit the URL again in the URL bar does it respond with a 200.
Maybe crawling also returns a 304, and Google won't index that?
Maybe prompt: "why would a 301 redirect lead to a 304 not modified instead of a 200 ok?", "would this 'break' Google's crawler?"
> When Google's crawler follows the 301 to the new URL and receives a 304, it gets no content body. The 304 response basically says "use what you cached"—but the crawler's cache might be empty or stale for that specific URL location, leaving Google with nothing to index.
You get a 304 because your browser tells the server what it has cached, and the server says "nothing changed, use that". In browsers you can bypass the cache by using Ctrl-F5, or in the developer tools you can usually disable caching while they're open. Doing so shows that the server is doing the right thing.
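To make that concrete, here is a rough Python sketch of the exchange. Everything in it is made up for illustration (the ETag value, the body, the local test server); it is not the blog's actual setup. It only shows that a 304 comes back when the client sends a validator such as If-None-Match, and that a client with nothing cached sends no validator and gets a normal 200 with a body.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

ETAG = '"v1"'                   # hypothetical validator for the page
BODY = b"<html>hello</html>"    # hypothetical page content


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # A 304 is only possible if the client sent a matching validator.
        if self.headers.get("If-None-Match") == ETAG:
            self.send_response(304)
            self.send_header("ETag", ETAG)
            self.end_headers()
            return
        self.send_response(200)
        self.send_header("ETag", ETAG)
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()
        self.wfile.write(BODY)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch(url, etag=None):
    """Return (status, body); sends If-None-Match only when etag is given."""
    req = urllib.request.Request(url)
    if etag:
        req.add_header("If-None-Match", etag)
    # Empty ProxyHandler so the local request ignores any proxy settings.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(req) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        # urllib reports non-2xx statuses, including 304, as HTTPError.
        return err.code, err.read()


def demo():
    server = HTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = f"http://127.0.0.1:{server.server_port}/"
    try:
        first = fetch(url)          # nothing cached: no validator sent
        second = fetch(url, ETAG)   # "I already have v1, has it changed?"
    finally:
        server.shutdown()
        server.server_close()
    return first, second
```

Calling `demo()` returns the status and body of both requests: the plain request gets the full page, the conditional one gets a 304 with an empty body.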
That's a different situation. The browser decides what to do depending on the situation and what was communicated about caching. Sometimes it sends a request to the server along with information about what it already has. Then it can get back a 304. Other times it already knows the cached data is fine, so it doesn't send a request to the server in the first place. The developer tools show this as a cached 200.
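A very simplified sketch of that decision, in Python. The `CacheEntry` fields and the `plan_request` helper are hypothetical names for illustration; real browsers also look at the reload type, Last-Modified, heuristic freshness, and Cache-Control directives such as no-cache, so treat this as a toy model only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CacheEntry:
    age_seconds: int             # how long ago the response was stored
    max_age_seconds: int         # freshness lifetime the server communicated
    etag: Optional[str] = None   # validator, if the server sent one


def plan_request(entry: Optional[CacheEntry]) -> str:
    """Describe what a simplified cache would do for one navigation."""
    if entry is None:
        # Nothing cached: plain request, the server must send the body.
        return "full GET, expect 200 with body"
    if entry.age_seconds < entry.max_age_seconds:
        # Still fresh: no network traffic; dev tools show a cached 200.
        return "no request, show cached 200"
    if entry.etag is not None:
        # Stale but revalidatable: ask the server with If-None-Match.
        return "conditional GET, expect 304 if unchanged"
    # Stale and no validator: fetch the whole thing again.
    return "full GET, expect 200 with body"
```

For example, `plan_request(CacheEntry(age_seconds=10, max_age_seconds=60, etag='"v1"'))` takes the "no request" branch, while the same entry at `age_seconds=120` takes the conditional GET branch.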
Has anyone noticed that the response for the blog page has a header: "x-robots-tag: noindex, nofollow"? What's the purpose of this header on a content page?
UPD: Sorry, never mind, I inspected the wrong response.
Request URL: https://journal.james-zhan.com/google-de-indexed-my-entire-b...
Request Method: GET
Status Code: 304 Not Modified