Brad Fitzpatrick (bradfitz) wrote in lj_dev,

Bot policy

Barring major, well-founded objections, I'm going to start enforcing a new policy on bots that scrape the site (userinfo/FOAF/fdata/etc).

The policy is:

If you're scraping, your user agent string must include a contact email (and ideally a URL for the project).

For instance:

http://fooland.com/ljtoy.html; bob@fooland.com
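
For bot authors, here is a minimal sketch of setting such a user agent in Python using the standard library's urllib. The project URL and email are just the placeholder values from the example above, and the function names are made up for illustration; swap in your own.

```python
import urllib.request

# Placeholder values from the example above; replace with your own project's.
PROJECT_URL = "http://fooland.com/ljtoy.html"
CONTACT_EMAIL = "bob@fooland.com"


def build_user_agent(project_url: str, contact_email: str) -> str:
    """Return a user agent string carrying a project URL and a contact email."""
    return f"{project_url}; {contact_email}"


def make_request(target_url: str) -> urllib.request.Request:
    """Build (but do not send) a request that identifies the bot and its owner."""
    user_agent = build_user_agent(PROJECT_URL, CONTACT_EMAIL)
    return urllib.request.Request(target_url, headers={"User-Agent": user_agent})
```

Passing the result of make_request() to urllib.request.urlopen() sends the request with that user agent attached.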

I'd like to be very bot-friendly, but that requires bots be friendly back.

I'd also like to put up a URL ($LJHOME/bots/ ?) which explains:

-- the rules (user agent, rates)
-- what we provide in machine-readable format
-- who to contact for other access

Then when we block a new bot that hasn't read the rules, the block message will include:

"You've been banned because you seem to be a new bot. Please read the bot rules at $LJHOME/bots/ and contact us to get unblocked."
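
To make the idea concrete, here is a rough sketch of what such a check could look like. This is not the actual site code: the regex, the function name, and the decision to treat any email-looking token as sufficient are all assumptions for illustration, and a real check would presumably also consider request rates.

```python
import re
from typing import Optional

# Assumption: a loose pattern for "something that looks like an email address".
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Message text quoted from the post; $LJHOME is left as the literal placeholder.
BLOCK_MESSAGE = (
    "You've been banned because you seem to be a new bot. "
    "Please read the bot rules at $LJHOME/bots/ and contact us to get unblocked."
)


def check_bot_user_agent(user_agent: str) -> Optional[str]:
    """Return None if the user agent includes a contact email, else the block message."""
    if EMAIL_PATTERN.search(user_agent or ""):
        return None
    return BLOCK_MESSAGE
```

A user agent like the example above passes, while one with no contact email gets the block message back.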

Sound acceptable?

Please pass the word along in relevant communities: lj_clients, lj_research, etc. (I don't follow everything.)