xb95 (xb95) wrote in lj_dev,

Duplicate Post Prevention

In this post in lj_clients, grimmtooth asked how to make clients immune to a common problem: after a server burp, users don't know whether their post went through, so they submit it again, sometimes repeatedly, creating duplicate posts. That sent me on a crusade to fix this server-side. What I came up with is below.

Basically, the duplock table doesn't get much use now except in payments and support. Its table definition specifically supports log entries; that part just never got implemented. I implemented it in a hopefully reasonable fashion to prevent duplicate journal entries from being posted.

I created a new userprop titled opt_nodupes. When it is enabled via editinfo.bml, every post you make is checked against the duplock table to see whether its MD5 digest matches that of any other post you've made in the past hour or so. (Per the schema, duplock clears every hour or so.) If it does, the anum (if clustered) and itemid are extracted from duplock and returned to the caller, telling the user that the post went through successfully when in actuality it went through earlier. (Note: it could just as easily return a duplicate post error, but I figured that if the user is submitting again, they probably didn't know it went through, so why throw them an error? Just say it's okay and return what was returned last time.) If there is no matching entry in duplock, one is created and set to expire in about an hour.
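To make the flow above concrete, here is a rough sketch in Python. It is not the actual ljprotocol.pl patch (which is Perl against the real duplock table); the Journal class, the in-memory dict standing in for duplock, the one-hour TTL constant, and the way itemid and anum are generated are all assumptions made up for illustration.

```python
import hashlib
import random

DUPLOCK_TTL = 3600  # seconds; stands in for "about an hour" (assumed value)


class Journal:
    """Hypothetical stand-in for the server; not LiveJournal's real code."""

    def __init__(self):
        # Stand-in for the duplock table:
        # (userid, md5 digest of body) -> (itemid, anum, expires_at)
        self.duplock = {}
        self.next_itemid = 1

    def post_event(self, userid, body, now):
        """Return (itemid, anum, was_duplicate) for a submitted post."""
        digest = hashlib.md5(body.encode("utf-8")).hexdigest()
        key = (userid, digest)

        hit = self.duplock.get(key)
        if hit is not None and hit[2] > now:
            # Same body seen recently: report success again and hand back
            # the identifiers that were returned the first time.
            itemid, anum, _expires = hit
            return itemid, anum, True

        # No live lock: "store" the entry and record a lock for it.
        itemid = self.next_itemid
        self.next_itemid += 1
        anum = random.randrange(256)  # assumed: some small random number
        self.duplock[key] = (itemid, anum, now + DUPLOCK_TTL)
        return itemid, anum, False


journal = Journal()
first = journal.post_event(1, "server burped?", now=0)
retry = journal.post_event(1, "server burped?", now=30)
print(first[:2] == retry[:2], retry[2])
```

The retry gets back the same itemid and anum as the first submission, so a client can't tell the difference between "posted just now" and "already posted a moment ago".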

The following files were changed, and here are links to their patch files:

ljprotocol.pl - Changes to the postevent sub to check for duplicates and update the duplock table.
editinfo.bml - Modified this to give the user the ability to toggle their duplicate post prevention userprop.
en_LJ.dat - Not sure the easiest way to add something to the language setup, but I found that appending the two new items to this file and then doing a bin/texttool.pl wipedb and bin/texttool.pl load worked fine.
create.bml - A one line addition to set the userprop to on by default. I couldn't find a better place to do this, so if this is a bad place, suggest away and I can modify it.
base-data.sql - A modification to this file so that bin/upgrading/update-db.pl will populate the new userprop that is required. If you just want to drop the single SQL command into your handy mysql interface, use this file.
dupposts.tar.gz - A tar gzip of the files, for easy downloading and application.

This is up and running on my goathack if anyone wants to try it. User test (pass: test) has it turned off (i.e. duplicate posts allowed), whereas user blah (pass: blah) has it turned on (no duplicates within an hour or so). I don't have the cron running to empty duplock, so on that server you will never be allowed to make the same post twice. (And duplock will rapidly fill up...)
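For anyone wondering what the missing cleanup amounts to, here is a small Python sketch of a periodic purge. How the real cron job empties duplock isn't shown here, so the purge_expired function and the dict layout (key -> expiry timestamp) are purely hypothetical.

```python
def purge_expired(duplock, now):
    """Drop every lock whose expiry time has passed; return how many went."""
    # Collect keys first so the dict isn't modified while iterating over it.
    expired = [key for key, expires_at in duplock.items() if expires_at <= now]
    for key in expired:
        del duplock[key]
    return len(expired)


# Hypothetical locks: (userid, digest) -> expires_at, in seconds
locks = {(1, "aaa"): 3600, (1, "bbb"): 7200, (2, "aaa"): 3000}
removed = purge_expired(locks, now=3600)
print(removed, sorted(locks))
```

Without something like this running, the locks never go away, which is exactly why the same post can never be made twice on my test install.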

Anyway, thoughts? Opinions? Any way I can make this better, or is it fine as is?

Update: Just in case anybody was wondering, only the body of an event is checked for uniqueness. If someone adjusts the subject but leaves the body the same, it will still come back as a duplicate. I figured this would be acceptable, better than trying to concatenate subject, mood, music, body, etc.
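A quick Python illustration of that choice, assuming the event is a plain dict with subject and body fields (the body_digest helper and the field names are made up for the example, not taken from the patch):

```python
import hashlib


def body_digest(event):
    """MD5 of the body only; subject, mood, music, etc. are ignored."""
    return hashlib.md5(event["body"].encode("utf-8")).hexdigest()


original = {"subject": "Monday", "body": "Same text both times."}
retitled = {"subject": "Monday (again)", "body": "Same text both times."}
edited = {"subject": "Monday", "body": "Different text."}

print(body_digest(original) == body_digest(retitled))
print(body_digest(original) == body_digest(edited))
```

Changing only the subject leaves the digest unchanged, so the retitled post is still treated as a duplicate; changing the body produces a different digest.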

Update Again: bradfitz wanted this always on rather than optional. The updated ljprotocol.pl is here. That is the only file that needs patching.