Edit: More info is in another post, but here it is.
Well, I talked this over with Brad and we still aren't really happy with any of the solutions.
Basically, we are asking ourselves how important changes made on the site are versus those made in the dat file. Syncing site changes back is really a pain, since they would then also have to get committed to CVS.
So the new proposal is that if a string is changed in the dat file and it already exists on the site, the load gives an error depending on the priority of the change. This still doesn't seem to make sense, because if the text gets completely changed around, things won't be right either. I mean, a small spelling fix in the dat file should not overwrite the db, since the typo was probably already caught on the site, but a big change such as rewording a sentence or paragraph is something a dev may want to force. Then again, overwriting with something big may be bad too.
My next idea was comparing the percentage of the string that changed, but this also seems to fail. The percentage may be large either because the string was reordered on the site, which means don't overwrite it, or because the new string is just drastically different. So to make this work, the percentage needs to be computed between the new string and the old string in the dat file, which means keeping copies of the old strings around somewhere.
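To make the percentage idea concrete, here is a rough sketch in Python (not the actual texttool code). It assumes the old dat string has been kept around somewhere, and it uses difflib's similarity ratio as a stand-in for "percentage changed". The function names, the return values, and the 50% threshold are all made up for illustration.

```python
from difflib import SequenceMatcher


def percent_changed(old, new):
    """Rough percentage of difference between two strings (0 = identical)."""
    return 100.0 * (1.0 - SequenceMatcher(None, old, new).ratio())


def decide(old_dat, new_dat, site_text, threshold=50.0):
    """Decide what to do with one string when loading the dat file.

    old_dat   -- previous version of the string in the dat file (kept copy)
    new_dat   -- version now in the dat file
    site_text -- what is currently live on the site
    threshold -- hypothetical cutoff, in percent, for a "big" dev change
    """
    if site_text == old_dat:
        # Nobody edited the string on the site, so nothing can be lost.
        return "overwrite"
    # The site copy was edited. Compare old dat vs new dat, NOT dat vs site,
    # so a reordering done on the site does not inflate the percentage.
    if percent_changed(old_dat, new_dat) >= threshold:
        return "overwrite"  # big rewrite by a dev, probably intended
    return "keep_site"      # small fix, likely already handled on the site
```

For example, `decide("Teh cat", "The cat", "The cat sat")` keeps the site copy, since the dat change is only a small spelling fix and the site text has already been edited. Whether a big rewrite should really win over a site edit is still the open question from above.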
Another idea: make it so that a change to en or en_LJ requires a dev to confirm it, at which point they would also make the change in the dat file and commit it to CVS. Then texttool load could simply overwrite strings for en and en_LJ, since what is live on the site and what is in the dat file would be the same. The downside is that it adds another step to getting a change made, and the initial rollout could be painful since there may be a lot of strings that differ between the dat file and the site. It is, however, the method that requires the least amount of work when updating the site's code, as texttool can just overwrite changed strings and you don't need to worry about conflicts. So the question here is: how often are changes made to en and en_LJ?
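As a rough sketch of what that would look like (again Python, not the real texttool), the initial cleanup is a one-time pass listing strings that differ between the dat file and the site, and after that the load is a plain overwrite. Both are modeled here as simple dictionaries of string code to text for a single language; the function names are hypothetical.

```python
def find_divergent(dat, site):
    """List the string codes present in both places whose text differs.

    This is the one-time audit a dev would have to work through before
    switching to the overwrite-only workflow.
    """
    return sorted(code for code in dat if code in site and site[code] != dat[code])


def load_overwrite(dat, site):
    """Return the site strings after a load where the dat file always wins.

    Strings that exist only on the site are left untouched.
    """
    merged = dict(site)
    merged.update(dat)
    return merged
```

Once `find_divergent` comes back empty for en and en_LJ, `load_overwrite` can never lose a site edit, because every edit went through a dev and into the dat file first.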