- This topic has 8 replies, 2 voices, and was last updated 7 years, 2 months ago by Tom.
May 26, 2015 at 11:51 am #110425 hari
Unfortunately, I seem to have a serious problem with Google after migrating from Thesis 1.8.6 to GeneratePress. Other plugins that might come into play, but were not changed during or after the migration, are s2Member and iThemes Security.
I have no concrete indication that the following is caused by GeneratePress, just the coincidence that it started the day after the migration.
I don’t expect you to solve my problem, but maybe you have an idea, or a hunch about how it might be related to GeneratePress.
Some pages on my blog are protected via s2Member, but most of the free content is not protected at all. For years there were never any crawl errors on these pages, and I did not change anything on them.
Starting the day after I migrated to GeneratePress, I began to see the following crawl errors from Googlebot. They increased over the course of a week and now span EVERY page on my blog:
Server Error Code 503 URL: “http://www.myblog.xx/wp-login.php?redirect_to=http%3A%2F%2Fwww.myblog.xx%2Fsomething%2F”
This URL is obviously inaccessible and is nonsense. I have no idea why the bot tries to access the pages via wp-login.php.
Again, the pages are OK and fully accessible to everyone without a login! And there was never such a problem until I migrated.
Some more information: I don’t have a robots.txt yet. I understand that I could add “Disallow: /wp-login.php” to a robots.txt for the bot. But this is just a shot in the dark, and I don’t want to do it until I understand what is going on here.
As EVERY page is affected, I have the feeling that Googlebot is completely blocked, but I have no idea what might have caused that, beginning the day I migrated to GeneratePress.
Unfortunately, I don’t have the option to uninstall s2Member or iThemes Security for a longer period to test the bot, as my content would be unprotected, and that is not acceptable.
Any idea or hint is highly appreciated!
May 26, 2015 at 1:16 pm #110441 hari
One addition, just a hunch that it might be related:
With the migration to GeneratePress, I added the link “http://www.myblog.xx/wp-login.php/” to the main navigation, which is now shown above the header logo.
So from a rendering viewpoint, “http://www.myblog.xx/wp-login.php/” is now part of every page.
I have a hunch that this might be the connection to what is going on in the crawler.
What I don’t understand, and what must be related to GeneratePress, is why Google’s crawler creates such a funny link from it, in the following form:
This link can’t work and must end in a 503. And so it does, and that creates the crawl errors.
But the login page itself is fine and fully accessible. Just not the funny link that Google somehow creates. But how? And how is this related to the theme?
Just some additional information to help nail down the cause …
May 26, 2015 at 11:12 pm #110498 Tom (Lead Developer)
GP’s code is so basic and simple – we don’t do any modifying of anything fancy like the wp-login.php page – so I’m not sure how GP could be causing any of this. I would also assume it would be happening to more people if it was GP related.
The redirect_to part of the URL is basic WordPress functionality – it definitely seems as if Google has decided to crawl pages hidden behind wp-login.php, or they’re simply following any links you have on your pages that go there.
a) Add a rel="nofollow" attribute to all links pointing to wp-login.php – this will tell Google not to follow the links.
b) Add the code you mentioned above to your robots.txt file – there’s no need for Google to ever reach wp-login.php, so if it keeps trying, it’s better to just block it entirely.
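Option b) can be checked offline before waiting days for the crawler. The following is a minimal sketch, assuming a robots.txt that contains only the Disallow line discussed in this thread under a wildcard user agent; the `ROBOTS_TXT` content, the `is_allowed` helper, and the example URLs are illustrative assumptions, and Python's standard-library parser may not match Googlebot's rule matching in every edge case.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content: the thread only mentions the Disallow line,
# the "User-agent: *" line is added here so the rule applies to every bot.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-login.php
"""


def is_allowed(url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the assumed robots.txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The kind of URL that showed up in the crawl errors.
    login_url = (
        "http://www.myblog.xx/wp-login.php"
        "?redirect_to=http%3A%2F%2Fwww.myblog.xx%2Fsomething%2F"
    )
    # A normal, public page on the same site.
    page_url = "http://www.myblog.xx/something/"

    print("login URL allowed:", is_allowed(login_url))
    print("page URL allowed:", is_allowed(page_url))
```

Because the rule is a path prefix, it covers wp-login.php with any query string, while ordinary pages stay crawlable.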
Let me know if either of the above helps or not 🙂
May 27, 2015 at 12:32 am #110515 hari
As I said, I never expected GP to be the definite cause; I am just checking every possibility.
I will add the robots.txt and test. It will take a few days for the bot’s results to show up, and I will share them then.
Still, this is strange, because it started after changing the theme. It must be a strange side effect or interdependency of some plugins.
May 27, 2015 at 12:33 am #110516 Tom (Lead Developer)
Very strange, I’m interested to know if blocking Google from your wp-login.php file works or not.
Can’t think of anything theme related, but I’ll definitely keep thinking.
Let me know your results when you get them 🙂
May 27, 2015 at 3:03 pm #110772 hari
Tom, I must now wait at least 3 days until I can see the Google crawler results for the days following the 27th, the day I created the robots.txt.
In the meantime I searched the web and found a few similar problems. I don’t want to spam links, just share what I have found. If you don’t want something like this here, let me know.
The problem seems to be old, and the cause was never really found:
May 29, 2015 at 9:10 am #111191 Tom (Lead Developer)
June 1, 2015 at 12:48 pm #111851 hari
As of 05/28 – the day after I blocked wp-login.php for Googlebot via robots.txt – I have had no more crawl errors. Simply zero!
So it is safe to say that the problem was circumvented via the robots.txt.
Nevertheless, it is still open and unresolved why the problem appeared at the same time as the migration from Thesis 1.8.6 to GeneratePress, but I guess there will never be a definite answer.
I only hope that it is not an indication of a broader problem, because the crawler should never have crawled these strange links, which were completely invalid.
So circumvented it is – at least. Have a nice day!
Hari
June 1, 2015 at 1:00 pm #111852 Tom (Lead Developer)
Glad it’s fixed.
I’m pretty positive it has something to do with the membership plugin, as the core GP theme is super lightweight and WP.org reviewed – it wouldn’t cause any issues like this.
The URL wasn’t technically invalid; that’s the URL WP uses to log the user in and then redirect them to the URL they wanted (which required them to log in).
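To illustrate the shape of that URL, here is a small sketch in Python. It is not WordPress’s own code (WordPress builds the link in PHP); `build_login_url` and `redirect_target` are hypothetical helper names, and the site and page URLs are the placeholders used earlier in this thread. It only shows that the `redirect_to` parameter is the percent-encoded address of the page the visitor originally asked for.

```python
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit


def build_login_url(site: str, target: str) -> str:
    """Build a login URL that carries the originally requested page.

    Hypothetical helper: mimics the URL shape seen in the crawl errors,
    not WordPress's actual implementation.
    """
    return f"{site}/wp-login.php?" + urlencode({"redirect_to": target})


def redirect_target(login_url: str) -> Optional[str]:
    """Return the decoded redirect_to value, or None if it is absent."""
    query = parse_qs(urlsplit(login_url).query)
    values = query.get("redirect_to")
    return values[0] if values else None


if __name__ == "__main__":
    url = build_login_url("http://www.myblog.xx", "http://www.myblog.xx/something/")
    print(url)
    print(redirect_target(url))
```

Decoding the parameter from a crawl error in this way shows which page triggered the login redirect, which can help narrow down whether a membership or security plugin is sending the crawler to the login page.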
Anyways, I’m glad it’s all sorted now.