....Such as when Google is using Deepbot, or what Google has been doing recently? Is this based on observation or on log files? ...
Hi JoJoH.

Didn't notice the query was from you when I first replied.
A potted version of the way Google's spider/update system has worked until recently:
In your log files you'll notice 'Googlebot' showing up. Googlebot came from a range of IP addresses... or two ranges actually, and the range told you which bot had come calling (there's a quick log-tallying sketch a little further down).
Previously, if G-bot came from a 64.*** IP number, that was Freshbot come a-callin'. Freshy tended to call throughout the month and be somewhat erratic, and often alarmed new webmasters because she would take a page here and there, or part of a site, or one level of a site, but not the whole lot. Freshy favoured sites with frequently updating content, like news sites.

Newer sites often received semi-regular visits from Freshy for a couple of months until she had figured out where the new site fit - i.e., frequently changing content, so keep visiting regularly... or content that changes rarely, so a monthly deep visit is adequate. Freshy has typically also been the cause of queries from webmasters about new sites - "Google was listing my new site yesterday and today it's gone!" - because Freshy picked up the pages and put them in the index for a couple of days, but a site or page wasn't typically 'cemented' into the index until it had been through a deepcrawl and full update. Freshy tended to disappear a day or two before Deepbot started her monthly deep crawl, and reappear once the deep crawl was nearly done.
Deepbot came from a 216.*** IP address and basically recrawled all of Google's indexed pages, plus any new stuff that Freshy had found. As the web grew, so did the time the deepcrawl took - around a week by early this year, with dribs and drabs still being spidered up to around ten days after starting. Deepcrawled pages weren't updated in the public index immediately; they were stored in the background while the Google engineers did their thing and tested and tweaked the new algorithm. The crawl cycle was roughly monthly, so about 3-4 weeks after the deepcrawl started, the update would begin, and the public index would fluctuate for several days as the new index was rolled out across the data centers. Towards the end of the update, PR (PageRank) would be updated.
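If you fancy checking this in your own logs, here's a rough sketch of the sort of thing I mean - Python, assuming an Apache-style combined access log where the client IP is the first field on each line. The filename and the exact 64./216. prefixes are illustrative only (Google never published the precise ranges), so adjust for what you actually see in your logs:

```python
# Tally Googlebot visits by IP range in an Apache-style combined log.
# The log path and the 64./216. prefixes are assumptions - tweak to taste.
import re
import sys
from collections import Counter

# In combined log format, the client IP is the first field on the line.
IP_RE = re.compile(r'^(\d{1,3}(?:\.\d{1,3}){3})\s')

def classify(ip):
    """Map an IP to the bot that range (historically) indicated."""
    if ip.startswith('64.'):
        return 'Freshbot (64.*)'
    if ip.startswith('216.'):
        return 'Deepbot (216.*)'
    return 'other Googlebot'

def tally(path):
    counts = Counter()
    with open(path, encoding='utf-8', errors='replace') as log:
        for line in log:
            # Crude but quick user-agent check.
            if 'Googlebot' not in line:
                continue
            m = IP_RE.match(line)
            if m:
                counts[classify(m.group(1))] += 1
    return counts

if __name__ == '__main__':
    path = sys.argv[1] if len(sys.argv) > 1 else 'access.log'
    for bot, hits in tally(path).items():
        print(f'{bot}: {hits} hits')
```

Point it at a month's worth of logs and the Freshy/Deepy pattern described above tends to jump right out.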
............
That's it in a nutshell. As a webmaster, once you started to understand that cycle, Google became a lot less intimidating, as you could fairly reliably predict what was happening with your listings.
In about the last three months, G has made some major changes. A Google representative has confirmed that we're not likely to see activity from the former Deepbot IP range again - i.e., Googlebot will be coming from a 64.*** address. Basically, they are moving towards a rolling update. At this point it is still unclear quite what the new cycle is, but I expect it to be much more Freshy-type behaviour, with pages entering the index within days of spidering (and presumably staying in the index, as there will be no deep crawl to cement them). The bigger mystery is probably how PR will be handled... it hasn't been reliable since the new system started, and it is unclear quite what current PR is based on, or how often/when it will be updated.
It's all a bit exciting really.
.....
As to how I know all this... iffun I told ya, I'd have to shoot ya!

j/k
I'm a member of another forum, which I won't mention here as I don't approve of cross-forum recruiting. It's a dedicated webmaster/SEO forum, and the information above comes from the combined observations of its members and my own experience, backed up by my various sites' logs.