
Google Feedfetcher being blocked?

Old November 9th, 2007, 3:48 PM   #10 (permalink)
Surpassing Dutch
Super #1
 
Edwin
 
Joined in Sep 2004
Hosted on SH98
2,548 posts
Gave thanks: 188
Thanked 45 times
Don't know
__________________
sh98
Old November 13th, 2007, 4:04 PM   #11 (permalink)
Bringing Sexy Back
Seasoned Poster
 
LissaKay
 
Joined in May 2006
Lives in Knoxville, TN
Hosted on SH130
95 posts
Gave thanks: 0
Thanked 5 times
Thumbs down

This is still unresolved. I am beyond frustrated ... somebody needs to fix this!

The latest from Google support:
Quote:
Unfortunately, from our end, this still looks like a misconfigured
server on your side. To verify this, I used curl, a standard Unix
utility that simulates fetches. Google's servers add an
"If-Modified-Since" header to requests to be more efficient when
fetching, so that an empty response can be returned if nothing has
changed since we last requested this feed. I ran this command:

$ curl -v -A "Feedfetcher-Google; (+http://www.google.com/feedfetcher.html)" \
    -z "Fri, 09 Nov 2007 22:45:06 GMT" \
    http://www.lissakay.com/index.php/weblog/rss_2.0/

This is what was output (the full output):
* About to connect() to www.lissakay.com port 80
* Trying 72.29.89.97... * connected
* Connected to www.lissakay.com (72.29.89.97) port 80
> GET /index.php/weblog/rss_2.0/ HTTP/1.1

User-Agent: Feedfetcher-Google; (+http://www.google.com/feedfetcher.html)
Host: www.lissakay.com
Pragma: no-cache
Accept: */*
If-Modified-Since: Fri, 09 Nov 2007 20:45:06 GMT

< HTTP/1.1 200 OK
< Date: Fri, 09 Nov 2007 22:49:24 GMT
< Server: Apache/1.3.37 (Unix) mod_auth_passthrough/1.8 mod_log_bytes/1.2 mod_bwlimited/1.4 FrontPage/5.0.2.2635.SR1.2 mod_ssl/2.8.28 OpenSSL/0.9.7a PHP-CGI/0.1b
< Expires: Fri, 09 Nov 2007 06:25:23 GMT
< Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
* Connection #0 to host www.lissakay.com left intact
* Closing connection #0

Note that the body is empty, even though a 200 OK response code was
given. If the server were configured correctly, it would either return a
304 Not Modified with no body, or a 200 OK response with a full body.
Since it's not doing either of those, there's not much we can do. Note
that other URLs on your servers (I believe you mentioned
http://www.lissakay.com/institches/i...tches/RSS_2.0/ at
some point) behave correctly in response to If-Modified-Since
requests, so perhaps you can see what they are doing differently.

Mihai Parparita
Google Reader Engineer
Surely someone somewhere knows how to resolve this!
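
For anyone who wants to repeat that curl check without curl, here is a rough Python sketch of the same conditional request. The function names and the three-way classification are only an illustration of what the quoted email describes, not anything Google or Surpass runs.

```python
import urllib.error
import urllib.request


def classify(status: int, body_len: int) -> str:
    """Label a response to a conditional GET the way the quoted email does."""
    if status == 304:
        return "ok: 304 Not Modified"
    if status == 200 and body_len > 0:
        return "ok: 200 with full body"
    if status == 200:
        return "broken: 200 with empty body"
    return f"unexpected: {status}"


def conditional_get(url: str, since: str, agent: str) -> str:
    """Send an If-Modified-Since request and classify what comes back."""
    req = urllib.request.Request(
        url, headers={"If-Modified-Since": since, "User-Agent": agent}
    )
    try:
        with urllib.request.urlopen(req, timeout=30) as resp:
            return classify(resp.status, len(resp.read()))
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 304 because it is not a 2xx status.
        return classify(err.code, 0)
```

Calling conditional_get() with the feed URL, the date string, and the Feedfetcher user agent from the command above should report which of the three cases the server falls into. As with curl, the result depends on the date you send.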
__________________
LissaKay.com is on SH130
Old November 14th, 2007, 1:17 AM   #12 (permalink)
Race Surpass
Super #1
 
MarkRH
 
Joined in Jul 2006
Lives in Oklahoma City, OK
Hosted on sh102
1,222 posts
Gave thanks: 18
Thanked 86 times
My WordPress feed seems to be updating fine with Google Reader. I looked at its most recent visitors and saw:
http://www.markheadrick.com/images/s...eedfetcher.gif

You might check your latest visitors list and see how many bytes are being sent out with a code of 200.

Kind of reminds me of those cases where a burned CD will not play in Player A but will play in Player B. But Player A will play other burned CDs. Which is broken? The CD or the Player?

I notice you're using Expression Engine. Unfortunately, I've no experience using it. If I get bored (heh..) I might make a test site with it. Uhm.. no promises there though.
Old November 14th, 2007, 6:10 PM   #13 (permalink)
Bringing Sexy Back
Seasoned Poster
 
LissaKay
 
Joined in May 2006
Lives in Knoxville, TN
Hosted on SH130
95 posts
Gave thanks: 0
Thanked 5 times
According to my latest visitors log, the server is returning a 304 response to Google Feedfetcher, even though there are several new posts:

*
/index.php/weblog/rss_2.0/
Http Code: 304 Date: Nov 14 16:37:37 Http Version: HTTP/1.1 Size in Bytes: -
Referer: -
Agent: Feedfetcher-Google; (+http://www.google.com/feedfetcher.html; 13 subscribers; feed-id=786644724791516834)


All other crawlers/bots are receiving the correct response, either 304 or 200 with an appropriate file size.
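
If the raw access log is available, the same check can be scripted. This is a minimal sketch that assumes the log is in Apache's combined format (the "latest visitors" view above is a formatted display, so the real file may differ). The sample lines are invented for illustration and are not taken from the actual log.

```python
import re

# Assumed layout: Apache combined log format.
LOG_RE = re.compile(
    r'\S+ \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) "[^"]*" "(?P<agent>[^"]*)"'
)


def feedfetcher_hits(lines):
    """Return (time, status, bytes) for every Feedfetcher-Google request."""
    hits = []
    for line in lines:
        m = LOG_RE.match(line)
        if m and "Feedfetcher-Google" in m["agent"]:
            # Apache writes "-" when no body bytes were sent.
            size = 0 if m["size"] == "-" else int(m["size"])
            hits.append((m["time"], int(m["status"]), size))
    return hits


def empty_200s(hits):
    """Keep only the hits that got a 200 with no body."""
    return [h for h in hits if h[1] == 200 and h[2] == 0]


# Invented sample lines, for illustration only.
SAMPLE = [
    '192.0.2.10 - - [14/Nov/2007:16:37:37 +0000] "GET /index.php/weblog/rss_2.0/ HTTP/1.1" '
    '304 - "-" "Feedfetcher-Google; (+http://www.google.com/feedfetcher.html)"',
    '192.0.2.10 - - [14/Nov/2007:17:40:02 +0000] "GET /index.php/weblog/rss_2.0/ HTTP/1.1" '
    '200 - "-" "Feedfetcher-Google; (+http://www.google.com/feedfetcher.html)"',
    '192.0.2.20 - - [14/Nov/2007:17:41:15 +0000] "GET /index.php/weblog/rss_2.0/ HTTP/1.1" '
    '200 18231 "-" "ExampleReader/1.0"',
]

for hit in empty_200s(feedfetcher_hits(SAMPLE)):
    print(hit)
```

Any line this prints is a 200 response that carried no feed content, which is the case worth showing to support.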

=================

PLEASE! There has to be someone that knows what is going on here!

I changed NOTHING in my feed when it all of a sudden stopped working. It had been working fine in all readers before. Now ONLY Google Reader is not updating.

Surpass support has washed their hands of this and is refusing to look any further into the matter, including the latest response I got after harassing Google endlessly about this. I am seriously considering having them move me to a different server! Or just cancelling all of my accounts and going elsewhere!
__________________
LissaKay.com is on SH130
Old November 15th, 2007, 5:26 AM   #14 (permalink)
Surpassing Dutch
Super #1
 
Edwin
 
Joined in Sep 2004
Hosted on SH98
2,548 posts
Gave thanks: 188
Thanked 45 times
Could it be the server? I'm on the same server as you, but I've got no clue on how to test this.
__________________
sh98
Old November 15th, 2007, 4:29 PM   #15 (permalink)
Bringing Sexy Back
Seasoned Poster
 
LissaKay
 
Joined in May 2006
Lives in Knoxville, TN
Hosted on SH130
95 posts
Gave thanks: 0
Thanked 5 times
Edwin,

I'm already using your feed to try to prove to the 'tards at Google that it isn't the server that is the problem ... or EE, or anything on this end. It's them and they just won't admit it

Thanks!
__________________
LissaKay.com is on SH130
Old November 15th, 2007, 6:07 PM   #16 (permalink)
Surpassing Dutch
Super #1
 
Edwin
 
Joined in Sep 2004
Hosted on SH98
2,548 posts
Gave thanks: 188
Thanked 45 times
You're welcome.

Now I should make sure I post some good things
__________________
sh98
Old November 16th, 2007, 9:37 AM   #17 (permalink)
Surpassing Dutch
Super #1
 
Edwin
 
Joined in Sep 2004
Hosted on SH98
2,548 posts
Gave thanks: 188
Thanked 45 times
Seems like my feed is getting through, but your last post is from November 8.

So strange. To me (as a newbie in these kinds of cases), it looks like it's something with Google.
__________________
sh98
Old November 18th, 2007, 5:20 PM   #18 (permalink)
Bringing Sexy Back
Seasoned Poster
 
LissaKay
 
Joined in May 2006
Lives in Knoxville, TN
Hosted on SH130
95 posts
Gave thanks: 0
Thanked 5 times
The latest from Google, which is still being ignored by the help desk ...

I keep forwarding this stuff, but they still fail to do the needful thing

====================

First of all, you can find some background information on Google's
crawl caching proxy here:
http://www.mattcutts.com/blog/crawl-caching-proxy/

You can see there how the system is set up: Google's various crawl
services (web search, blog search, etc.) all share a common cache of
crawled content, to minimize the load we put on people's sites.
However, these services do need to verify that the cached copy of a
page is still current. They do that using "if-modified-since"
requests, which are more light-weight than full content requests. In
the case of your feed, this is what happens:

1. Google Reader checks the crawl cache and finds a recent copy of
your feed.
2. Wanting to make sure it's a current copy, Reader sends your server
an if-modified-since request. (Assume for now that your blog hasn't
been updated, since we crawl more often than you post.)
3. Instead of a 304 ("Not Modified") response, your server sends back
a 200 ("OK") with no content.
4. Reader sees the 200 and assumes this must be the new, up-to-date
version of your feed.
5. Reader therefore ignores the cached copy, and "updates" your feed
with the version it got. Unfortunately, that version was empty, so
there's no change to make.
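
The five steps above can be sketched as a toy model. This is only an illustration of the behavior described in this message, not Google's actual code. The function name and the idea of representing a feed as a list of post titles are assumptions made for the example.

```python
def reader_refresh(shown, cached, status, body_items):
    """Toy model of one refresh, following steps 1-5 above.

    shown      -- posts currently visible in the reader
    cached     -- posts in the shared crawl cache
    status     -- HTTP status the blog's server returned
    body_items -- posts found in the response body (empty list if no body)
    """
    if status == 304:
        source = cached        # cache confirmed current, so use it
    elif status == 200:
        source = body_items    # treated as the fresh version, even if empty
    else:
        source = []            # assumption: other statuses change nothing
    # Add whatever is new; nothing already shown is removed.
    return shown + [post for post in source if post not in shown]


shown = ["post 1"]
cached = ["post 1", "post 2"]

# Correct server: 304 lets the reader pick up "post 2" from the cache.
print(reader_refresh(shown, cached, 304, []))

# Misbehaving server: 200 with an empty body, so nothing new appears.
print(reader_refresh(shown, cached, 200, []))
```

The second call is the failure case: the cache already holds the newer post, but the empty 200 makes the reader skip it.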

Some important points about this:
* If we had gotten a 304 in step 3 above, Reader would have used the
cached copy, which probably would have had your recent posts in it.
* If the timing works out such that your blog has indeed been updated
since the last time it was crawled and cached by one of our services,
then the 200 would come back with content and we'd update it in Google
Reader. This explains why you do occasionally see updates. They're
infrequent, though, because there's a relatively narrow time window
relative to our crawls during which you can post and have your server
return a correct response.
* Your feed updates fine in BlogLines and other services because they
presumably don't have this same system set up, with the cached copies
and the "if-modified-since" requests. (If they're only running a
single service, there's less of a need to set up a more complex system
like this.)
* The curl command, referenced several times before, will return
different results depending on the date used. If put far enough in the
past, it will return your content, since you've updated since then.
However, our requests generally have a very recent date/time, so it's
more common for there to be no updates.

To sum it all up, it all comes down to the 200 vs. 304 responses. If
you can get your server to consistently send 304 responses for
unmodified content, then the issue will most likely go away. For
reference, the spec for using 304 response codes can be
found here:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html
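
As a rough illustration of the behavior that spec asks for, here is a minimal sketch of the decision a feed script would make. It is not Expression Engine's code; the function name, its arguments, and the render_feed callback are hypothetical.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def respond(if_modified_since, last_modified, render_feed):
    """Return (status, headers, body) for a feed request.

    if_modified_since -- raw header value, or None if the header is absent
    last_modified     -- timezone-aware datetime of the newest post
    render_feed       -- hypothetical callable that returns the feed as bytes
    """
    # HTTP dates have one-second resolution and are expressed in GMT.
    last_modified = last_modified.astimezone(timezone.utc).replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}

    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # unparseable header: fall through to a full 200
        if since is not None:
            if since.tzinfo is None:
                since = since.replace(tzinfo=timezone.utc)
            if last_modified <= since:
                # Nothing new: 304 with no body.
                return 304, headers, b""

    # New content, or no usable header: 200 with the complete feed.
    return 200, headers, render_feed()
```

The point is that only two outcomes exist: a 304 with no body, or a 200 that carries the whole feed. A 200 with an empty body is never produced.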

If you can get this change made and still verify that there are
problems, we'll be happy to look into this again. However, as things
stand now, there is nothing further we can do from our end. I'm sorry
for all the back-and-forth about it, and I know it's a confusing
issue, but hopefully this (admittedly long) explanation helps.

Thanks,
Graham
__________________
LissaKay.com is on SH130