Introduction to ScoreDraft

The source code of ScoreDraft is hosted on GitHub at https://github.com/fynv/ScoreDraft, where you can always find the latest changes that I have made.

In the sub-directory https://github.com/fynv/ScoreDraft/tree/master/python_test, I have put prebuilt binaries for 64-bit Windows and Linux, along with a deployment of the Python scripts.

This document introduces the use of each of the basic elements of ScoreDraft.

HelloWorld Example (using TrackBuffer)

Let’s start with the simplest possible example to explain the basic usage and design ideas of ScoreDraft.


import ScoreDraft
from ScoreDraft.Notes import *

# The first phrase of "Twinkle, Twinkle, Little Star" as a note sequence
seq = [do(), do(), so(), so(), la(), la(), so(5, 96)]

buf = ScoreDraft.TrackBuffer()     # a buffer to hold the synthesized waveform
ScoreDraft.Piano().play(buf, seq)  # play the sequence with a piano, writing into buf
ScoreDraft.WriteTrackBufferToWav(buf, 'twinkle.wav')  # write the result to disk

Play Calls

 
ScoreDraft.Piano().play(buf, seq) 

As the most important interface design of ScoreDraft, “Play Calls” are a class of commands of the form
instrument.play(buf, seq), which simply means: play the sequence “seq” with the instrument and write the result to “buf”.
Similarly, you can also use a percussion group to play, or use a “singer” to sing. We will use the term
“Play Calls” to refer to any command of these sorts.

We will sometimes pass tempo and reference-frequency parameters into a Play Call. These typically have default
values and are not compulsory.

Imports

 import ScoreDraft
 from ScoreDraft.Notes import *

The first thing to do is to import the “ScoreDraft” package, which provides the core Python interfaces of ScoreDraft.

Most application code will also import the note definitions from the “ScoreDraft.Notes” module. However, the core interface of ScoreDraft does not include any specific definition of musical notes. It simply accepts a physical frequency f_ref in Hz as a reference frequency for each Play Call, and a relative frequency f_rel[i], which is just a multiplier, for each note. The physical frequency of each note can then be calculated as f_note[i] = f_ref * f_rel[i]. In the note definition module ScoreDraft.Notes, a set of note functions do(), re(), mi(), fa()… is defined to convert musical language into physical numbers. These functions are really simple in nature, allowing users to easily modify or extend them for special purposes, such as when an alternative tuning (other than 12-equal-temperament) is desirable.
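
For example, a custom note function can be sketched against this convention. Below, “nu” is a hypothetical function for a neutral third (350 cents), an interval that does not exist in 12-equal-temperament; it returns the same (rel_freq, duration) tuples as the built-in note functions:

 def nu(octave=5, duration=48):
     # 2**(octave-5) shifts by octaves; 2**(350/1200) is the 350-cent interval
     rel_freq = (2.0 ** (octave - 5)) * (2.0 ** (350.0 / 1200.0))
     return (rel_freq, duration)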

Score representation

 seq=[do(),do(),so(),so(),la(),la(),so(5,96)]

The score itself is represented as a set of Python lists, called sequences. How these sequences are formed will be explained in the succeeding sections.
The elements of the sequences are processed consecutively, but the generated sounds can overlap with each other when the sequences contain backspace operations.
Because the sequences are just Python lists in nature, the full function set of Python can be utilized to automate the score-authoring work. Explaining those tricks may require a separate document, but a small example follows.
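
For instance, since sequences are ordinary lists, repetition and concatenation come for free:

 phrase = [do(), do(), so(), so()]
 seq = phrase * 2 + [la(), la(), so(5, 96)]  # repeat the phrase, then append an ending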

TrackBuffer

ScoreDraft uses track-buffers to store waveforms. A track-buffer can be used either as intermediate storage for a synthesis result or as the final buffer for the mix-down of several intermediate buffers.

The package ScoreDraft provides a class TrackBuffer, which is a direct encapsulation of the C++ interface. Compared to the class Document (to be introduced later), the class TrackBuffer is a low-level interface.
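
As a sketch of what multi-track work looks like at this low level (ScoreDraft.MixTrackBufferList is assumed here as the low-level mixing helper; check your version of ScoreDraft for the exact name):

 import ScoreDraft
 from ScoreDraft.Notes import *

 buf1 = ScoreDraft.TrackBuffer()
 buf2 = ScoreDraft.TrackBuffer()
 ScoreDraft.Piano().play(buf1, [do(), mi(), so()])  # melody track
 ScoreDraft.Cello().play(buf2, [do(4, 144)])        # a sustained bass note

 target = ScoreDraft.TrackBuffer()
 ScoreDraft.MixTrackBufferList(target, [buf1, buf2])  # assumed mixing helper
 ScoreDraft.WriteTrackBufferToWav(target, 'mixed.wav')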

HelloWorld Example (using Document)

 import ScoreDraft
 from ScoreDraft.Notes import *
 
 doc=ScoreDraft.Document()
 
 seq=[do(),do(),so(),so(),la(),la(),so(5,96)]
 
 doc.playNoteSeq(seq, ScoreDraft.Piano())
 doc.mixDown('twinkle.wav')

Most musical pieces need multiple track-buffers. The class ScoreDraft.Document is provided as a unified track-buffer manager, and using it is recommended over using ScoreDraft.TrackBuffer directly.

As shown in the above example, using the class Document is accompanied by some changes in the writing style of the Play Calls. When issuing a Play Call through the class Document, the target track-buffer is always implicit, so a parameter is no longer necessary. At the same time, the instrument used for playing becomes a parameter. The format now looks like doc.play(seq, instrument) instead of instrument.play(buf, seq).

This has a few benefits. First, it simplifies the creation of track-buffers: the document object can do that for you implicitly during Play Calls. Second, it largely simplifies the mix-down call. You don’t need to enumerate all the track-buffers to be mixed when they are managed inside the document object. Third, visualization components can exploit polymorphism by replacing the Document class with an extended version. For example, the Meteor visualizer can be enabled with minimal effort by replacing “ScoreDraft.Document” with “ScoreDraft.MeteorDocument”.

The class Document also manages tempo and reference frequency parameters internally, so we don’t pass them through the Play Calls anymore in this case.
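
Here is a short two-track sketch combining these points; setTempo() and setReferenceFrequency() are the Document methods described later in this document:

 doc = ScoreDraft.Document()
 doc.setTempo(100)                  # beats per minute (the default is 80)
 doc.setReferenceFrequency(264.0)   # in Hz (this is also the default)

 melody = [do(), mi(), so(5, 96)]
 bass = [do(4, 192)]
 doc.playNoteSeq(melody, ScoreDraft.Piano())  # each Play Call writes to an implicitly managed track-buffer
 doc.playNoteSeq(bass, ScoreDraft.Cello())
 doc.mixDown('two_tracks.wav')                # mixes all managed track-buffers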

Initialization of Instruments/Percussions/Singers

Users can always run PrintCatalog.py to get a list of all available instrument/percussion/singer initializers. The output will look like:

{
  "Engines": [
    "PercussionSampler - Percussion",
    "InstrumentSampler_Single - Instrument",
    "InstrumentSampler_Multi - Instrument",
    "KeLa - Singing",
    "UtauDraft - Singing"
  ],
  "Instruments": [
    "Ah - InstrumentSampler_Single",
    "Cello - InstrumentSampler_Single",
    "CleanGuitar - InstrumentSampler_Single",
    "Lah - InstrumentSampler_Single",
    "Lah - InstrumentSampler_Multi",
    "String - InstrumentSampler_Single",
    "Violin - InstrumentSampler_Single"
  ],
  "Percussions": [
    "BassDrum - PercussionSampler",
    "ClosedHitHat - PercussionSampler",
    "Snare - PercussionSampler"
  ],
  "Singers": [
    "GePing - KeLa",
    "KeLaTest - KeLa",
    "Up - KeLa",
    "GePing_UTAU - UtauDraft"
  ]
}

The first list, “Engines”, gives the names of the available engines and the type of each engine (is it an instrument, percussion, or singing engine?). You can find the actual definitions of these engine classes by their names. For example, the ScoreDraft.PercussionSampler class is defined in ScoreDraft/PercussionSampler.py. These classes can be used to create instruments/percussions/singers directly. However, you typically need to provide a path to the sample data and other information.

The 3 succeeding lists give the names of ready-to-use instrument/percussion/singer initializers and the engines they are based on. ScoreDraft creates these initializers automatically at start-up using pre-deployed samples. The definitions of the initializers are dynamic code blocks, so you cannot find them in the source code. However, using them is simple. For example, you can initialize a Cello instrument with:

Cello1 = ScoreDraft.Cello()

Instrument Sampler

The instrument sampler engine uses one or more .wav files as input to create an instrument. The .wav files must have 1 or 2 channels in 16-bit PCM format. The instrument sampler works by simply stretching the sampled audio and applying an envelope, so be sure that the samples have sufficient length.

Single-sampling

You can use the class ScoreDraft.InstrumentSampler_Single to create an instrument directly. At creation time, you need to provide a path to the .wav file. The path can be either an absolute path or a path relative to the starting folder.

flute = ScoreDraft.InstrumentSampler_Single('c:/samples/flute.wav')

For pre-deployment, just put wav files into ScoreDraft/InstrumentSamples.  The file name without extension will be used as the name of the instrument initializer, which will be shown in the PrintCatalog lists.
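
For instance, assuming a hypothetical sample file Flute.wav has been deployed this way, a matching initializer becomes available:

 flute = ScoreDraft.Flute()  # "Flute" comes from the file name Flute.wav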

Multi-sampling

You can use the class ScoreDraft.InstrumentSampler_Multi to create an instrument directly. At creation time, you need to provide a path to a folder containing all the .wav files of an instrument. The audio samples should span a range of different pitches. The sampler will generate notes by interpolating between the samples according to the target pitch.

guitar = ScoreDraft.InstrumentSampler_Multi('c:/samples/guitar')

For pre-deployment, first create a sub-folder in ScoreDraft/InstrumentSamples, whose name will be used as the name of the instrument initializer, then put multiple .wav files into the newly created folder.

SoundFont 2 Instruments

ScoreDraft now supports initializing instruments using SoundFonts. The class ScoreDraft.SF2Instrument is used for interfacing with SoundFonts. To create an instrument with ScoreDraft.SF2Instrument, you need to provide a path to the .sf2 file and the index of the preset you want to use.

 
piano = ScoreDraft.SF2Instrument('florestan-subset.sf2', 0)

The function ScoreDraft.ListPresetsSF2() can be used to obtain a list of all available presets in a .sf2 file:

 
ScoreDraft.ListPresetsSF2('florestan-subset.sf2')

Pre-deployment is also supported for SF2. Users can put a .sf2 file into ScoreDraft/SF2. The file name without extension will be used as the name of the instrument initializer. Because we need to know which preset to use, a preset_index parameter is still needed when calling the initializer.
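
As a sketch, assuming a hypothetical file Florestan.sf2 has been deployed into ScoreDraft/SF2:

 piano = ScoreDraft.Florestan(0)  # the preset index is still required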

SoundFont 2 support is based on a port of TinySoundFont. Here I acknowledge Bernhard Schelling for the work!

Percussion Sampler

The percussion sampler engine uses one .wav file as input to create a percussion. The .wav file must have 1 or 2 channels in 16-bit PCM format. The percussion sampler works by simply applying an envelope, so be sure that the sample has sufficient length.

You can use the class ScoreDraft.PercussionSampler to create a percussion directly. At creation time, you need to provide a path to the wav file.

drum = ScoreDraft.PercussionSampler('./Drum.wav')

For pre-deployment, simply drop one .wav file into ScoreDraft/PercussionSamples. The file name without extension will be used as the name of the percussion initializer. You can call it directly without any parameters.

KeLa Engine

The KeLa engine is a simple singing engine that directly uses a folder of .wav files as a voicebank to create a singer. The .wav files must have 1 channel in 16-bit PCM format. Currently, conversion between different sample rates is not taken care of, so using .wav files not sampled at 44100 Hz will produce unexpected results. Unlike the instrument sampler and the percussion sampler, the KeLa engine takes in short pieces of audio samples, extracts features, and uses the features to generate the voice. So you don’t need to use long audio samples; just try to sing a flat pitch during recording.

You can use the class ScoreDraft.KeLa to create a singer directly. At creation time, you need to provide a path to the folder used as voicebank.

jinkela = ScoreDraft.KeLa('d:/jinkela')

For pre-deployment, you need to put a sub-folder of .wav files into ScoreDraft/KeLaSamples. Each sub-folder of “KeLaSamples” defines a singer initializer. The sub-folder name is used as the name of the singer initializer. Each sub-folder contains multiple .wav files, whose file names without extension correspond to the lyric strings.

UtauDraft Engine

The UtauDraft engine uses a UTAU voicebank to create a singer.

You can use the class ScoreDraft.UtauDraft directly. At creation time, you need to provide a path to the UTAU voicebank, and optionally a bool value indicating whether to use CUDA acceleration. The default is to use CUDA acceleration when available. Pass in False to disable it without attempting.

cz = ScoreDraft.UtauDraft('d:/CZloid', False)

For pre-deployment, you can put a UTAU voicebank directly into the ScoreDraft/UTAUVoice folder. Each sub-folder of UTAUVoice defines a singer initializer. The sub-folder name with an “_UTAU” suffix is used as the name of the singer initializer. For example, the sub-folder “GePing” will define a singer initializer GePing_UTAU(). If the original sub-folder name is unsuitable for use as a Python variable name, you should rename it to prevent a Python error. A dynamically generated initializer also has an option of whether to use CUDA or not; you only need to use it when you want to disable CUDA.
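
A short sketch, assuming the GePing voicebank mentioned above has been deployed:

 geping = ScoreDraft.GePing_UTAU()           # uses CUDA when available
 geping_cpu = ScoreDraft.GePing_UTAU(False)  # explicitly disable CUDA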

Instrument Play

The kind of sequence used for instrument play is called a note sequence. Note sequences are Python lists consisting of tuples of the form (rel_freq, duration), where “rel_freq” is a float and “duration” is an integer.

Example:

 seq=[(1.0, 48), (1.25, 48), (1.5,48)]

With an existing document object “doc”, you can “play” the sequence using some instrument like the following:

 doc.playNoteSeq(seq, ScoreDraft.Piano())

The float “rel_freq” is a frequency relative to the reference frequency stored in the document object, which can be set with doc.setReferenceFrequency() and defaults to 264.0 (in Hz).

The duration of a note is one beat when the integer value “duration” equals 48. The document object manages a tempo value in beats/minute, which can be set using doc.setTempo() and defaults to 80. Note that it is also allowed to feed doc.setTempo() with a series of control points, which builds a Dynamic Tempo Mapping, to be discussed later.
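
As a worked example: at the default tempo of 80 beats/minute, one beat lasts 60/80 = 0.75 seconds, so a note with duration 48 lasts 0.75 seconds, and so(5,96) (two beats) lasts 1.5 seconds.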

When the “ScoreDraft.Notes” module is imported, we can write the note sequences in a more musically intuitive way:

 seq=[do(5,48), mi(5,48), so(5,48)]

The note functions have intuitive names (do(), re(), mi(), fa(), so(), la(), ti()), and they take 2 integer parameters: octave and duration. The return values are tuples. While the duration parameter is directly passed to the duration component of the returned tuple, the rel_freq component is decided by the octave value plus the note function itself. The default octave is 5, which is the center octave. For example, the returned rel_freq of “do(5,48)” will be 1.0, and the rel_freq of “do(4,48)” will be 0.5.

When rel_freq < 0.0, ScoreDraft will treat the note as a special marker, depending on whether duration > 0 or duration < 0. When duration > 0, it means a rest. When duration < 0, it means a backspace. “ScoreDraft.Notes” provides 2 functions, “BL(duration)” and “BK(duration)”, to formalize these uses. Backspaces are very useful: when the cursor moves backwards, subsequent notes can overlap with previous notes, making the representation of chords possible. For example, a major triad can be written like:

 seq=[do(5,48), BK(48), mi(5,48), BK(48), so(5,48)]

 

Percussion Play

For percussion play, first you should consider what percussions to choose to build a percussion group. For example, I choose BassDrum and Snare:

	BassDrum=ScoreDraft.BassDrum()
	Snare=ScoreDraft.Snare()	
	perc_list= [BassDrum, Snare]

The kind of sequence used for percussion play is called a beat sequence. Beat sequences consist of tuples of the form (percussion_index, duration). Both “percussion_index” and “duration” are integers, where “percussion_index” refers to the index into the “perc_list” defined above, and “duration” works the same as in instrument play.

Often, we want to define some utility functions to make the writing of beat sequences more intuitive:

	def dong(duration=48):
		return (0,duration)
	
	def ca(duration=48):
		return (1,duration)

Now you can use the above 2 functions to build a beat sequence like:

	seq = [dong(), ca(24), dong(24), dong(), ca(), dong(), ca(24), dong(24), dong(), ca()]

With an existing document object “doc”, you can “play” the sequence using “perc_list” like:

	doc.playBeatSeq(seq, perc_list)

 

Singing

ScoreDraft provides a singing interface similar to instrument and percussion play. The kind of sequence used for singing is called singing sequence. A singing sequence is a little more complicated than a note sequence. For example:

seq = [ ("mA", mi(5,24), "mA", re(5,24), mi(5,48)), BL(24)]
seq +=[ ("du",mi(5,24),"ju", so(5,24), "rIm", la(5,24), "Em", mi(5,12),re(5,12), "b3", re(5,72)), BL(24)]

Each singing segment contains one or more lyrics, each given as a string followed by 1 or more tuples that define the pitch corresponding to the leading lyric. In the simplest case, a tuple can have the same form as an instrument note, (freq_rel, duration), which was the only form supported before.

A recent extension is that you can put multiple control points into one tuple, such as (freq_rel1, duration1, freq_rel2, duration2, …). Pitches will be linearly interpolated between control points. The last control point defines a period of flat pitch. Pitches are not interpolated between tuples. Under this extension, you can define a piece-wise pitch curve by concatenating multiple instrument notes, like do(5,24)+so(5,24)+do(5,0), which defines a pitch curve of 3 control points and a total duration of 48.
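
For instance, a single syllable sung with such a pitch curve can be sketched as follows (tuple concatenation with “+” joins the control points):

 seq = [("mA", do(5, 24) + so(5, 24) + do(5, 0))]  # slide from do up to so and back, total duration 48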

All lyrics and notes in the same singing segment are intended to be sung continuously. However, when there are rests/backspaces, the singing segment will be broken into multiple segments to sing. The singing command looks like the following, with an existing “doc” and some singer:

doc.sing(seq, ScoreDraft.TetoEng_UTAU())

You can also mix raw notes without lyrics into the singing segments. In that case, these notes will be sung using a default lyric. Vice versa, if you try playing a singing sequence with an instrument, the notes in the sequence will get played while the lyrics are ignored, but note that the extended form is not supported.

The singing interface also supports rapping. Originally, rapping was written in a specialized grammar, which looks like:

seq= [  ("kan", 48, 1.0, 0.5, "jian", 24, 0.75, 0.55, "de", 24, 0.75, 0.55, "kan", 24, 1.0, 0.5, "bu", 24, 0.75, 0.55,"jian", 24, 0.75, 0.55, "de", 24, 0.75, 0.55  ) ]

In the above sequence, there are 3 numbers following each lyric. The first one is the duration of the syllable, which is an integer. The 2 floats that follow are the starting and ending frequencies of the syllable.

Now that we have the piece-wise pitch-curve extension, the functionality of the old rapping grammar, which uses 2 control points, is fully covered. Therefore, we can use note concatenation to write rap too. An example using this writing style is Tik Tok.

There is a utility CRap() defined in ScoreDraftRapChinese.py to help generate the tones of Mandarin Chinese (the 4 tones). An example using CRap():

seq= [ CRap("chu", 2, 36)+CRap("he", 2, 60)+CRap("ri", 4, 48)+CRap("dang", 1, 48)+CRap("wu", 3, 48), BL(24)]

 

KeLa Engine

The KeLa engine is a simple singing engine. It synthesizes by stretching individual audio samples into individual notes, which is similar to 単独音 in UTAU. When using the KeLa engine, there’s no difference between writing 1 syllable per singing segment and putting multiple syllables into one singing segment.

UtauDraft Engine

The UtauDraft engine tries to be compatible with all kinds of UTAU voicebanks, including 単独音, 連続音, VCV, and CVVC, as much as possible. oto.ini and .frq files will be used to understand the audio samples. prefix.map will also be used when one is present.

When using the UtauDraft engine, for 単独音, you can use the names defined in oto.ini as lyrics, just like in UTAU.

For other types of voicebanks, in order to handle transitions correctly as well as to simplify the lyric input, the user should choose one of the lyric converters. Currently there are:

ScoreDraft.CVVCChineseConverter: for CVVC Chinese
ScoreDraft.XiaYYConverter: for XiaYuYao-style Chinese
ScoreDraft.JPVCVConverter: for Japanese 連続音
ScoreDraft.TsuroVCVConverter: for Tsuro-style Chinese VCV
ScoreDraft.TTEnglishConverter: for Delta-style (Teto) English CVVC
ScoreDraft.VCCVEnglishConverter: for CZ-style English VCCV

To set the lyric converter, just call singer.setLyricConverter(converter), for example:

import ScoreDraft
Ayaka = ScoreDraft.Ayaka_UTAU()
Ayaka.setLyricConverter(ScoreDraft.CVVCChineseConverter)

For CZ-style VCCV, you need one more call, singer.setCZMode(), to let the engine use a special mapping method.

The converter functions are defined in the following form; write your own if the above converters do not meet your requirements:

def LyricConverterFunc(LyricForEachSyllable):
	...
	return [(lyric1ForSyllable1, weight11, isVowel11, lyric2ForSyllable1, weight21, isVowel21, ...),
	        (lyric1ForSyllable2, weight12, isVowel12, lyric2ForSyllable2, weight22, isVowel22, ...),
	        ...]

The argument ‘LyricForEachSyllable’ has the form [lyric1, lyric2, …], where each lyric is a string: the input lyric of one syllable.

The converter function should convert 1 input lyric into 1 or more lyrics that split the duration of the original syllable. A weight value should be provided to indicate each converted lyric’s share of the duration. A bool value “isVowel” needs to be provided to indicate whether the converted lyric contains the vowel part of the syllable.
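
As a minimal sketch of this contract, here is a pass-through converter that keeps each syllable intact, gives it the full duration, and marks it as containing the vowel; any real converter would split syllables according to the voicebank style:

 def PassThroughConverter(LyricForEachSyllable):
     # one (lyric, weight, isVowel) triple per input syllable
     return [(lyric, 1.0, True) for lyric in LyricForEachSyllable]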

Dynamic Tempo Mapping

Dynamic tempo mapping is used to accurately define the timeline position and tempo of the generated sound.

It works by replacing the tempo term of a Play Call with a Python list of the following form:

tempo_map=[(beat_position_1, sample_position_1), (beat_position_2, sample_position_2), …]

beat_position_i is an integer that represents a position in the input sequence. It has the same unit as the duration term, where 1 beat is represented by 48.

sample_position_i is a floating-point number that represents an absolute position on the destination timeline. Its unit is PCM samples.

When beat_position_1 = 0, the starting point of the generated waveform will be aligned with sample_position_1.

When beat_position_1 is not 0, the starting point of the generated waveform is decided by the current cursor position of the destination track-buffer.

For beat_position_i, it is suggested to calculate it by calling ScoreDraft.TellDuration(seq), which measures the length of the sequence seq.

For sample_position_i, we often need to manually measure the target material (audio/video) that we are aligning to.

Example:


seq=[do(),do(),so(),so(),la(),la(),so(5,96)]

buf = ScoreDraft.TrackBuffer()

piano = ScoreDraft.Piano()

tempo_map = [ (0, 44100.0), (ScoreDraft.TellDuration(seq), 220500.0) ]

piano.play(buf, seq, tempo_map)

The above code will generate the sound accurately aligned to the 1 s ~ 5 s (sample 44100 ~ 220500) period, at a 44100 Hz sample rate.

A bigger example can be found at:
Doraemon Theme Song (An align&mix demo)

Visualization

ScoreDraft now contains 2 Qt5-based visualization extensions.

QtPCMPlayer

QtPCMPlayer can be used to visualize PCM data in track-buffers. 2 interfaces are provided:

  • ScoreDraft.QPlayTrackBuffer(buf): plays back a track-buffer, where “buf” is the TrackBuffer object to be played.
  • ScoreDraft.QPlayGetRemainingTime(): returns the remaining playback time in seconds.

Note that QPlayTrackBuffer() is an asynchronous call, which means that it will return immediately to execute the succeeding Python code after the playback is started. You can continue to submit more track-buffers to be played back. The track-buffers will be queued and played back consecutively.
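
A small sketch of the asynchronous behavior:

 import time
 import ScoreDraft
 from ScoreDraft.Notes import *

 buf = ScoreDraft.TrackBuffer()
 ScoreDraft.Piano().play(buf, [do(), re(), mi()])
 ScoreDraft.QPlayTrackBuffer(buf)  # returns immediately; playback runs in the background
 # ... other work can happen here while the audio plays ...
 time.sleep(ScoreDraft.QPlayGetRemainingTime())  # wait until playback finishes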

Meteor

Meteor can be used to visualize all kinds of sequences while playing back the mixed track. The easiest way to use Meteor is to use ScoreDraft.MeteorDocument instead of ScoreDraft.Document. The definition of ScoreDraft.MeteorDocument contains all the interfaces defined in ScoreDraft.Document, plus an extra method MeteorDocument.meteor(chn=-1). If you are using ScoreDraft.Document in an old project, you just need to replace it with ScoreDraft.MeteorDocument and call doc.meteor() at the end of the code; the visualizer will thus be activated. Unlike QPlayTrackBuffer(), doc.meteor() is a synchronous call. The execution will be blocked until the end of playback.
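
In code, the swap looks like this (a minimal sketch reusing the HelloWorld example):

 import ScoreDraft
 from ScoreDraft.Notes import *

 doc = ScoreDraft.MeteorDocument()  # was: ScoreDraft.Document()
 seq = [do(), do(), so(), so(), la(), la(), so(5, 96)]
 doc.playNoteSeq(seq, ScoreDraft.Piano())
 doc.meteor()  # synchronous: blocks until the end of playback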

The Meteor visualizer now has a JavaScript port that can be integrated into web pages. This site uses it a lot for demos. See here for the details.

Exporting to MIDI

Note sequences can be exported to MIDI files. This function has existed for a long time, but I didn’t mention it because the work is only half done.

The extension MIDIWriter provides a function ScoreDraft.WriteNoteSequencesToMidi(seqList, tempo, refFreq, fileName). “seqList” is a list consisting of multiple note sequences. “tempo” is an integer like the ones used in Play Calls (Dynamic Tempo Mapping is currently not supported). “refFreq” is a float like the ones used in Play Calls. “fileName” is a string. It is half done in that:

  • It only supports note sequences; don’t try it with other kinds of sequences.
  • The exported MIDI files consist of note events only, so most MIDI players cannot play them correctly (Audacity can, though). They are intended to be imported into other music software for further processing.

An example using MIDIWriter is contained in FlyMeToTheMoon_eq.py:

ScoreDraft.WriteNoteSequencesToMidi([seq1, seq2], 120, 264.0 *1.25, "FlyMeToTheMoon.mid")