Friday, June 25, 2010

Music Sorting Challenge

#130E: Create an algorithm that sorts a music library on a hard drive. (Goal completed, but I decided to sort manually)

EDIT 2: I found out that exporting the directory list from the command prompt to a plaintext file doesn't export special characters correctly, which explains why some files were left in the root directory. Proof: feeding the exported text back into the command prompt produces an error. Sorry, I didn't put quotes around the path in this image, but the same error occurs with quotes as well.
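One possible workaround (not what was done for this goal) is to build the list with a small script that writes UTF-8 directly instead of redirecting `dir` output. This is only a minimal Python sketch, assuming Python 3 is available; `write_filelist` and the example paths are made-up names for illustration.

```python
import os

def write_filelist(root, out_path):
    """Write the full path of every file under root, one per line, as UTF-8.

    Returns the number of files listed. Directories themselves are skipped,
    similar in spirit to 'dir /a:-d /b /s'.
    """
    count = 0
    with open(out_path, "w", encoding="utf-8") as out:
        for folder, _subfolders, files in os.walk(root):
            for name in sorted(files):
                out.write(os.path.join(folder, name) + "\n")
                count += 1
    return count

# Hypothetical usage; substitute your own folders:
# write_filelist(r"D:\Music", r"C:\filelist.txt")
```

Whether the command prompt then reads those names back correctly from a batch file depends on the console code page, so this only addresses the export half of the problem.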

EDIT: With the command line's "move" command, the rename operation can be skipped altogether: renaming a file and moving it to the correct folder can be done in one step. The music files to sort still need to be moved to a separate folder for this to work. (Note that I added parentheses around the folder name so that, when you sort by name, the folder appears at the top of the window.)
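To illustrate the one-step idea, here is a rough Python sketch that builds a single `move` line per file, with the new name given as part of the destination. The function name `move_command`, the "(To Sort)" folder, and the other paths are hypothetical; it also assumes the destination folder already exists, since `move` does not create it.

```python
import ntpath  # Windows path rules, regardless of where this script runs

def move_command(src, dest_dir, new_name):
    """Build one cmd.exe 'move' line that renames and relocates a file."""
    dest = ntpath.join(dest_dir, new_name)
    line = 'move "{}" "{}"'.format(src, dest)
    # Inside a batch file a literal percent sign has to be doubled.
    return line.replace("%", "%%")

print(move_command(r"D:\(To Sort)\artist-3-song.mp3",
                   r"D:\Music\Artist\Album",
                   "Artist - 03 - Song.mp3"))
# move "D:\(To Sort)\artist-3-song.mp3" "D:\Music\Artist\Album\Artist - 03 - Song.mp3"
```

Each generated line can be pasted into the batch file in place of a separate rename step.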

Note: Pictures to be added later.

For those that noticed that the supposed name of the hard drive for this goal was leaked, yes, that is the descriptive name of the drive, but the proper name is KE MHD. Wonder how that would fare had I brought the drive to a friend's house... it has happened once with my flash drive.

As for the algorithm, I managed to complete it. But creating an algorithm doesn't do anything by itself; in the real world, results are needed. I was given a set of 25,000 (yes, 25 thousand) music files, all in one directory, and tasked with sorting them into folders to make them easier to browse. I also had to give them shorter file names: a file name can be long and carry all sorts of descriptive information, but long names cause problems when the files are moved into other folders, in particular an error once the full path length goes past the Windows limit of roughly 260 characters.
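A quick way to see which files are at risk is to scan the path list for entries at or over the limit. The sketch below is only an illustration: `over_limit` is a made-up helper, and its default of 260 is the classic Windows MAX_PATH value (which includes the terminating null, so a path of 260 characters is already too long). Pass a different number if your setup differs.

```python
def over_limit(paths, limit=260):
    """Return the paths whose length reaches or exceeds the limit."""
    return [p for p in paths if len(p) >= limit]

# Hypothetical example: one short path and one far too long.
paths = [
    "D:\\Music\\Artist - 01 - Song.mp3",
    "D:\\Music\\" + "Very Long Descriptive Name " * 12 + ".mp3",
]
for path in over_limit(paths):
    print(len(path), "characters:", path[:40] + "...")
```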

So I got a copy of the files (no messing with the originals) for processing.

First things first. Since coding that algorithm would take a long time, I decided to write a script that generates a file list and then import the list into Excel, adding and stripping tabs as needed. I used Notepad++ for this job since it reloads the file list whenever the file changes. My file list script is as follows:

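rem Switch to the right-clicked folder (/d also changes the drive).
cd /d "%~1"
rem List every file (no directories), full paths, including subfolders.
dir /a:-d /b /s > "C:\filelist.txt"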
This is placed under the folder context menu. It generates a list of the full path of each file in the directory. Note the 'cd' command, which runs first to get into the directory that was right-clicked on.

And now, to rename the files. Because of the way these files are named, I came up with an algorithm for this specific set of music files; your results may vary. The steps I took were:
  • Before starting the procedure below, make sure the music files are in their proper artist/album or compilation album folder. For a compilation album, the track number typically comes first so the songs sort in the correct order alphabetically. For a single-artist album, the artist comes first, then the track number. This helps when you copy some songs into another folder (examples: flash drive or movie making), since you can identify the artist and song easily.
  • Generate the file list (it contains full paths, which is necessary when renaming files from the command line).
  • Copy the file list and paste it into a section of an Excel sheet.
  • Replace the dash with a tab. Doing this in Notepad and then copy-pasting into Excel puts the different values into separate cells, making them easier to manipulate.
  • (Optional) Replace the backslash with a tab. Then, when you import the list into Excel, the path pieces are easier to remove.
  • Put the artist, track, and song in their respective fields.
  • For the track field, set a custom cell format of 00 (it forces two digits to be displayed, so 9 becomes 09). Although Windows orders files correctly without the leading 0 when the track number comes first, the order gets jumbled when the artist name comes first.
  • Copy the entire Excel cell range, paste it into Notepad, remove all tabs, and save it as a batch file.
  • Run the batch file.
  • Mark the folder as 'compressed' and repeat (optional; used for marking which folders are done).
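The split, pad, and rejoin part of the steps above can be sketched in code. This is not the Excel workflow itself, just a rough Python equivalent under stated assumptions: file names look like "Artist - Track - Song.ext" with " - " as the separator (the actual naming of the original files isn't shown here), and `new_name` and `rename_command` are hypothetical helper names.

```python
import ntpath  # Windows path rules, regardless of where this script runs

def new_name(filename, compilation=False):
    """Rebuild 'Artist - Track - Song.ext' with a two-digit track number.

    Compilation albums put the track first; single-artist albums put the
    artist first. Raises ValueError if the name does not have three fields.
    """
    stem, ext = ntpath.splitext(filename)
    artist, track, song = [part.strip() for part in stem.split(" - ")]
    track = "{:02d}".format(int(track))  # 9 becomes 09, like the 00 format
    if compilation:
        fields = [track, artist, song]
    else:
        fields = [artist, track, song]
    return " - ".join(fields) + ext

def rename_command(full_path, compilation=False):
    """Build a cmd.exe 'ren' line; ren takes a bare new name, not a path."""
    name = new_name(ntpath.basename(full_path), compilation)
    return 'ren "{}" "{}"'.format(full_path, name)

print(rename_command(r"D:\Music\Artist\Album\Artist - 9 - Song.mp3"))
# ren "D:\Music\Artist\Album\Artist - 9 - Song.mp3" "Artist - 09 - Song.mp3"
```

A name with an extra or missing dash fails loudly with a ValueError instead of producing a wrong rename, which is the same case the hand check is meant to catch.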

Of course, the list is quickly checked by hand for misalignments, because there can be exceptions where the fields are misaligned and the script won't work properly (for example, if the album or the song title had a dash in it). The 'compressed' attribute is used purely as a label to mark which folders have already been processed, in case I need to resume at a later date.
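Part of that hand check could be narrowed down automatically by flagging only the names that don't split into the expected number of fields. A minimal Python sketch follows, assuming " - " as the separator and three fields per name; `find_misaligned` and the sample names are invented for illustration.

```python
def find_misaligned(names, separator=" - ", expected_fields=3):
    """Return the names that do not split into the expected number of fields."""
    return [n for n in names if len(n.split(separator)) != expected_fields]

names = [
    "Artist - 3 - Song.mp3",
    "Artist - 4 - Song - Live.mp3",   # extra dash in the song title
    "Artist 5 Song.mp3",              # no separators at all
]
for name in find_misaligned(names):
    print("check by hand:", name)
```

It only counts fields, so a name with the right number of dashes in the wrong places would still slip through and need a human look.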

Later on, I found that they could be grouped into two folders: Compilation Albums and Single-Artist Albums. With that in mind, the only step to take was to put the songs in their proper folders and rename them all at once.

Approximately five days later, the goal was finished. However, a few more days were spent correcting a few errors.