well, you asked for it
hey doug, hopefully you're actually interested, and weren't just being sarcastic. [smirk] oh, and i get emailed a copy of the comments when people post them, so you can feel free to comment on the relevant entry, even if it's old.
if i had known that you were interested in music theory, i probably would have told you about this a long time ago.
the actual process i'm using (after several previous iterations which were somewhat less efficient and/or successful) is relatively simple. it's even recursive, to my everlasting shame. i don't even really remember why i hate recursion so much, but it just seems dirty to me somehow. probably just my innate need to be different from all the other computer scientists.
anyway, assign each of the chromatic notes (yes, all of them, for now) a signed char numeric identifier, centered on zero = middle C. one of the great things about transposition is that you can do it after the fact, and the only rule which would really impact where you transpose it to is the one about being "easily singable by the average musician," which is fairly nebulous anyway (so we discard it for the time being). okay, that means that for our purposes, the tonic is always middle C, and rule number one says you always have to start and end on the tonic, with the penultimate note being degree two.
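to make the numbering concrete, here's a tiny python sketch (not the actual code; the helper names `note_name` and `transpose` are made up for illustration). it assumes note ids are plain semitone offsets from middle C, which is what lets transposition happen after the fact:

```python
# note ids are semitone offsets from middle C (0 = middle C).
# a signed char covers -128..127, which is far more range than a melody needs.
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def note_name(n):
    """name a note id in scientific pitch notation (middle C is C4)."""
    octave, pitch_class = divmod(n, 12)
    return f"{NOTE_NAMES[pitch_class]}{octave + 4}"


def transpose(melody, semitones):
    """shift every note by the same amount; the shape of the melody is unchanged."""
    return [n + semitones for n in melody]


if __name__ == "__main__":
    melody = [0, 2, 4, 2, 0]  # generated with the tonic on middle C
    print([note_name(n) for n in melody])
    print([note_name(n) for n in transpose(melody, -5)])  # moved down a fourth
```

sharps are used for every black key here just to keep the table short; proper spelling depends on the mode.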
so we seed the recursion with middle C (0), and branch the tree into as many separate strings as there are legal next steps, which are governed by the next several rules. (i'm doing this off the top of my head, so the list here might not be complete or correct.)
2. length between 8 and 13 notes. gives us a termination condition for the recursion: if there are 6 or more notes already in the string, and you're about to add degree 2, then tack on the final tonic (6 + 2 = 8 notes minimum) and put it in the "good" pile; otherwise, if there are 11 or more notes in the string (11 + 2 = 13 notes maximum), and you haven't put it in the "good" pile yet, throw it out.
3. only the notes which are part of the mode are allowed, so if you try to add one that isn't, cull the string.
4. valid intra-melodic intervals are: M/m2nds, M/m3rds, P4ths, P5ths, m6ths(asc), P8ves. try adding each of those, ascending and descending (16 options), and let the modal filter from rule 3 above narrow it down from there.
5. actually, if the previous step was a leap greater than a 3rd, force a stepwise recovery in the opposite direction (two options, one of which fails the modal filter).
6. and if the previous step was a leap, but not greater than a third, we disallow successive leaps in the same direction (only 10 options passed to the filter).
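rules 1 through 6 can be sketched roughly like this in python. this is a reconstruction from the description above, not the actual code, and it leans on a few assumptions: the `MODES` table spells every mode on C; "m6ths(asc)" is read as ascending-only, so it tries 15 moves rather than 16; the final step from degree 2 down to the tonic has to be legal under rules 5 and 6 as well; and the names `legal_moves`, `enumerate_cantus`, `min_len`, and `max_len` are all made up for illustration.

```python
# each mode as semitone offsets from the tonic (always C = 0 here)
MODES = {
    "ionian":     (0, 2, 4, 5, 7, 9, 11),
    "dorian":     (0, 2, 3, 5, 7, 9, 10),
    "phrygian":   (0, 1, 3, 5, 7, 8, 10),
    "lydian":     (0, 2, 4, 6, 7, 9, 11),
    "mixolydian": (0, 2, 4, 5, 7, 9, 10),
    "aeolian":    (0, 2, 3, 5, 7, 8, 10),
    "locrian":    (0, 1, 3, 5, 6, 8, 10),
}

STEPS = (1, 2)                        # m2, M2
LEAPS_UP = (3, 4, 5, 7, 8, 12)        # m3, M3, P4, P5, m6 (assumed ascending only), P8
LEAPS_DOWN = (-3, -4, -5, -7, -12)
ALL_MOVES = (1, -1, 2, -2) + LEAPS_UP + LEAPS_DOWN   # rule 4


def sign(x):
    return (x > 0) - (x < 0)


def legal_moves(melody):
    """moves allowed after the last interval in the melody (rules 4 to 6)."""
    if len(melody) < 2:
        return ALL_MOVES
    prev = melody[-1] - melody[-2]
    if abs(prev) > 4:   # rule 5: leap bigger than a third, recover by step the other way
        return tuple(-sign(prev) * s for s in STEPS)
    if abs(prev) > 2:   # rule 6: leap of a third, no further leap in the same direction
        return tuple(m for m in ALL_MOVES if abs(m) <= 2 or sign(m) != sign(prev))
    return ALL_MOVES


def enumerate_cantus(mode, min_len=8, max_len=13):
    """phase one: every string that starts on 0 and ends degree 2 -> 0."""
    scale = set(MODES[mode])
    degree_two = MODES[mode][1]
    good = []

    def grow(melody):
        for move in legal_moves(melody):
            nxt = melody[-1] + move
            if nxt % 12 not in scale:          # rule 3: modal filter
                continue
            candidate = melody + [nxt]
            # rules 1 and 2: close on the tonic once the string is long enough
            if nxt == degree_two and len(candidate) + 1 >= min_len:
                if -nxt in legal_moves(candidate):
                    good.append(candidate + [0])
            # keep growing only if degree 2 and the tonic would still fit
            if len(candidate) + 2 <= max_len:
                grow(candidate)

    grow([0])
    return good


if __name__ == "__main__":
    # small bounds so it finishes instantly; the real 8..13 run is far bigger
    print(len(enumerate_cantus("dorian", min_len=4, max_len=7)))
```

with the real bounds of 8 and 13 the tree gets large, so expect pure python to be slow; the small bounds above are only there to show the shape of the recursion.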
i'm pretty sure that's where i ended phase one, and just let it crunch through all of those possibilities. then, i added a phase two check on the way to the good pile, to weed out even more, based on other rules:
7. unique climax, melodically consonant with the tonic (M/m3rd, P4th, P5th, M/m6th, P8ve, M/m10th).
8. total range limited to never more than a 10th (a rather extreme case; many known cantus firmi restrict themselves to a 5th or 6th).
9. there should generally be "three or four judiciously employed leaps," so i throw out any strings with more than five. (i could be more strict if i wanted, but we're getting into the more subjective rules now, and the number of results seems manageable, so why the hell not?)
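a rough python sketch of the phase two checks for rules 7 through 9 (again, not the actual code). it assumes the tonic is 0, so the climax note's id is also its interval above the tonic; that "a 10th" means a major 10th (16 semitones); and that a leap is anything bigger than a major 2nd. the function and constant names are made up for illustration.

```python
# intervals above the tonic (in semitones) counted as consonant in rule 7:
# m3, M3, P4, P5, m6, M6, P8, m10, M10
CONSONANT_WITH_TONIC = {3, 4, 5, 7, 8, 9, 12, 15, 16}
MAX_RANGE = 16   # rule 8: assumed to mean a major 10th
MAX_LEAPS = 5    # rule 9


def has_unique_consonant_climax(melody):
    """rule 7: the highest note appears once and is consonant with the tonic (0)."""
    peak = max(melody)
    return melody.count(peak) == 1 and peak in CONSONANT_WITH_TONIC


def within_range(melody):
    """rule 8: distance from the lowest to the highest note."""
    return max(melody) - min(melody) <= MAX_RANGE


def count_leaps(melody):
    """number of intervals larger than a major 2nd."""
    return sum(1 for a, b in zip(melody, melody[1:]) if abs(b - a) > 2)


def passes_rules_7_to_9(melody):
    return (has_unique_consonant_climax(melody)
            and within_range(melody)
            and count_leaps(melody) <= MAX_LEAPS)


if __name__ == "__main__":
    print(passes_rules_7_to_9([0, 2, 4, 5, 7, 5, 4, 2, 0]))  # climax on the 5th
    print(passes_rules_7_to_9([0, 2, 4, 2, 4, 2, 0]))        # climax hit twice
```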
after analyzing a fair number of these results by hand, i put a few more "subjective" constraints into the phase two check (but still fairly lenient; just trying to throw out things which are clearly wrong).
10. bound on total leap distance. i don't remember the exact number i used (the code is at home right now). avoids patently ridiculous things like having all four leaps be octaves or something similar.
11. minimum and maximum number of direction changes. again, don't have exact figures on me.
12. embargo on repetition of groups or sequences. was actually one of the first things i did, but for some reason i didn't think about it until now. in fact, i suspect this may be where the majority of the processing time is spent (though even on my crappy computer, the whole process of enumerating one of the modes only took about 20-25 minutes, the longest time i remember). currently, i'm suppressing any strings that have any substring of four consecutive notes occurring more than once anywhere in the string. there are some examples i've found in the output which lead me to believe i might eventually want to suppress repeated substrings of length three as well, but for the moment, i'm erring on the side of inclusion.
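rules 10 through 12 boil down to three small measurements, sketched here in python (not the actual code; the names are made up for illustration). since the exact bounds for rules 10 and 11 aren't given above, the first two functions only compute the numbers and leave the thresholds to the caller. the repetition check assumes "substring" means the exact same notes, not a transposed copy of the same shape.

```python
def total_leap_distance(melody):
    """rule 10: sum of the sizes (in semitones) of every interval bigger than a major 2nd."""
    return sum(abs(b - a) for a, b in zip(melody, melody[1:]) if abs(b - a) > 2)


def direction_changes(melody):
    """rule 11: how many times the line switches between rising and falling."""
    signs = [(b > a) - (b < a) for a, b in zip(melody, melody[1:])]
    signs = [s for s in signs if s != 0]   # ignore repeated notes, if any
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


def has_repeated_run(melody, run_length=4):
    """rule 12: true if any run of `run_length` consecutive notes occurs more than once."""
    seen = set()
    for i in range(len(melody) - run_length + 1):
        run = tuple(melody[i:i + run_length])
        if run in seen:
            return True
        seen.add(run)
    return False


if __name__ == "__main__":
    melody = [0, 2, 4, 0, 2, 4, 2, 0]
    print(has_repeated_run(melody))                # length-4 check
    print(has_repeated_run(melody, run_length=3))  # the stricter length-3 check
```

the set lookup keeps this linear in the length of the string, so with strings of at most 13 notes the check itself is cheap per string; any real cost would come from the sheer number of strings.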
and i'm pretty sure that's where i left off. after all the aforementioned pruning, there were approximately 500,000 entries remaining for each of the seven modes (that's where the ~3.5 million comes in). unfortunately, i've been rather busy lately, so haven't really touched it since that last entry, and like i said, the next step is getting it all fed into a database, so i can treat it further in a persistent manner, instead of re-generating or reading in and parsing multiple 300MB text files.
really the coolest thing so far, though, is the way i'm able to automatically generate real sheet music using ABC. that's where the example image came from. once i get this finished, i kinda wanna make a website, where people can (well, "could," if anyone ever looked at it) browse the resulting database, and possibly give aesthetic scores to annotate the records...
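for the curious, turning one of these strings into ABC can be sketched in python like so (not the actual code; `abc_note` and `to_abc` are made-up names). the assumptions: every mode is spelled on C so each scale degree maps to one letter and the key signature supplies the accidentals; each note is written as a whole note in its own 4/4 bar; and the title is a placeholder.

```python
# each mode as semitone offsets from the tonic (always C = 0 here)
MODES = {
    "ionian":     (0, 2, 4, 5, 7, 9, 11),
    "dorian":     (0, 2, 3, 5, 7, 9, 10),
    "phrygian":   (0, 1, 3, 5, 7, 8, 10),
    "lydian":     (0, 2, 4, 6, 7, 9, 11),
    "mixolydian": (0, 2, 4, 5, 7, 9, 10),
    "aeolian":    (0, 2, 3, 5, 7, 8, 10),
    "locrian":    (0, 1, 3, 5, 6, 8, 10),
}
LETTERS = "CDEFGAB"   # one letter per scale degree, since the tonic is always C


def abc_note(n, mode):
    """ABC pitch for a note id: C is middle C, c is an octave up, C, an octave down."""
    octave, pitch_class = divmod(n, 12)
    letter = LETTERS[MODES[mode].index(pitch_class)]
    if octave >= 1:
        return letter.lower() + "'" * (octave - 1)
    return letter + "," * (-octave)


def to_abc(melody, mode, title="cantus firmus", index=1):
    """a complete ABC tune: header fields, then one whole note per bar."""
    bars = " | ".join(abc_note(n, mode) + "4" for n in melody)
    return "\n".join([
        f"X:{index}",          # reference number
        f"T:{title}",          # title (placeholder)
        "M:4/4",               # meter
        "L:1/4",               # unit note length, so "4" means a whole note
        f"K:C {mode[:3]}",     # key: C plus the first three letters of the mode
        bars + " |]",
    ])


if __name__ == "__main__":
    print(to_abc([0, 2, 3, 5, 7, 5, 3, 2, 0], "dorian"))
```

the output is plain text, so it can be handed to any ABC typesetter to get the actual sheet music.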
ps: please don't steal my idea. it's taking me a long time since i never seem to have much if any free time, and it would make me very sad if somebody else beat me to the punch...