insomniphilia

Jul 17 '11

Marking duplicate files with IPython

I use (and love) Notational Velocity for storing and quickly retrieving notes on OS X. Recently, I accidentally enabled both Simplenote and Dropbox syncing on two different computers, which left me with many duplicate notes. Some quick IPython hacking fixed the problem.

Notes are stored in files like Title.rtf, and duplicate notes ended up in files like Title.1.rtf. I first found and extracted all the base filenames, and counted how often each occurred:

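import collections
import re

# IPython shell capture: one list entry per filename in the current directory
files = !ls
# Group 1 is the base title; the optional group 2 is a ".N" duplicate suffix
R = re.compile(r'(.*?)(\.\d+)?\.rtf$')
m = map(R.match, files)
c = collections.Counter([mm.group(1) for mm in m
    if mm is not None])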
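The same counting step can be tried outside IPython on a made-up list of filenames. This is only a sketch: the helper name count_base_names and the sample filenames are invented for illustration, and the pattern is the one used above with the dot before "rtf" escaped.

```python
import collections
import re

# Matches "Title.rtf" and "Title.1.rtf"; group 1 captures "Title" in both cases.
NOTE_RE = re.compile(r'(.*?)(\.\d+)?\.rtf$')

def count_base_names(filenames):
    """Count how many files share each base name (hypothetical helper)."""
    matches = (NOTE_RE.match(name) for name in filenames)
    return collections.Counter(m.group(1) for m in matches if m is not None)

# Invented sample data, standing in for the output of `!ls`.
sample = ["Groceries.rtf", "Groceries.1.rtf", "Ideas.rtf", "notes.txt"]
counts = count_base_names(sample)
duplicated = sorted(name for name, n in counts.items() if n > 1)
print(duplicated)
```

One caveat with this pattern: a note whose real title ends in a dot and digits (say, "v1.2.rtf") is counted under the base name "v1", so the flagged files are still worth checking by eye.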

Next, I created a list of all the extra files (I was lazy, and used glob for this, rather than using my original list of filenames), and used the command line to color-label the extra files in Finder (based on this hint):

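import glob
from functools import reduce  # a builtin on Python 2; imported so Python 3 works too

# For every title seen more than once, collect its numbered copies (Title.N.rtf)
extra_files = [glob.glob(fn + ".*.rtf")
    for fn, count in c.items() if count > 1]
# Flatten the list of lists; the [] initializer keeps reduce from failing when nothing matched
extra_files = reduce(list.__add__, extra_files, [])
for fn in extra_files:
    !osascript -e "tell application \"Finder\" to set label index of alias POSIX file \"$fn\" to 1"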
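If the filenames might contain double quotes or other characters the shell treats specially, the labeling step could also be run from plain Python without going through a shell. A rough sketch follows, with hypothetical helper names label_command and label_files. It only builds the same AppleScript one-liner as above and hands it to osascript through subprocess, so it has the same OS X requirement.

```python
import os
import subprocess

def label_command(path, label_index=1):
    """Build the osascript argument list that sets a Finder label on `path`."""
    # Escape backslashes and double quotes for the AppleScript string literal.
    escaped = path.replace("\\", "\\\\").replace('"', '\\"')
    script = ('tell application "Finder" to set label index of '
              'alias POSIX file "%s" to %d' % (escaped, label_index))
    return ["osascript", "-e", script]

def label_files(paths, label_index=1):
    """Label each file in Finder (only works on OS X, where osascript exists)."""
    for path in paths:
        # Use an absolute path so the result does not depend on how
        # osascript resolves relative paths.
        subprocess.check_call(label_command(os.path.abspath(path), label_index))
```

Calling label_files(extra_files) would then stand in for the loop above. Passing the command as a list means the shell never sees the filename, so only the AppleScript quoting has to be handled.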

Voilà! The duplicate files were all labeled orange. I went through them using Quick Look, to make sure I wasn’t deleting anything important, and cleaned up my Notational Velocity directory.
