musisep.dictsep package¶

Submodules¶

musisep.dictsep.__main__ module¶

Wrapper for the dictionary learning algorithm. When invoked, the audio sources in the supplied audio file are separated.

musisep.dictsep.__main__.correct_signal_length(signal, length)[source]¶

Right-pad or right-crop the signal such that it fits the desired length.

Parameters

signal (ndarray) – Signal to be adjusted
length (int) – Desired length of the signal

Returns

Adjusted signal

Return type

ndarray
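
The padding and cropping behaviour can be illustrated with a minimal sketch. It is not the package's implementation: the helper name fit_length is hypothetical, and both zero padding and a one-dimensional signal are assumptions.

```python
import numpy as np

def fit_length(signal, length):
    """Right-pad with zeros or right-crop `signal` to exactly `length` samples.

    Hypothetical helper; zero padding and 1-D input are assumptions.
    """
    signal = np.asarray(signal, dtype=float)
    if signal.size >= length:
        # Long enough already: keep only the first `length` samples.
        return signal[:length]
    # Too short: append zeros on the right-hand side.
    return np.pad(signal, (0, length - signal.size))

padded = fit_length([1.0, 2.0, 3.0], 5)
cropped = fit_length([1.0, 2.0, 3.0], 2)
```

Either way the result has exactly the requested length, so a separated track can be aligned with its reference track sample by sample.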

musisep.dictsep.__main__.main(mixed_soundfile, orig_soundfiles, out_name, out_name_run_suffix='', inst_num=2, tone_num=1, pexp=1, qexp=0.5, har=25, sigmas=6, sampdist=256, spectheight=6144, logspectheight=1024, minfreq=20, maxfreq=20480, runs=10000, lifetime=500, num_dicts=10, mask=True, color=False, plot_range=None, spect_method='pursuit', supply_dicts=None, spect_plots=())[source]¶

Wrapper function for the dictionary learning algorithm.

Parameters

mixed_soundfile (string) – Name of the mixed input file
orig_soundfiles (list of string or NoneType) – Names of the files with the isolated instrument tracks or None
out_name (string) – Prefix for the file names
out_name_run_suffix (string) – Extra label for the output files
inst_num (int) – Number of instruments
tone_num (int) – Maximum number of simultaneous tones for each instrument
pexp (float) – Exponent for the addition of sinusoids
qexp (float) – Exponent to be applied on the spectrum
har (int) – Number of harmonics
sigmas (float) – Number of standard deviations after which to cut the window/kernel
sampdist (int) – Time intervals to sample the spectrogram
spectheight (int) – Height of the linear-frequency spectrogram
logspectheight (int) – Height of the log-frequency spectrogram
minfreq (float) – Minimum frequency in Hz to be represented (included)
maxfreq (float) – Maximum frequency in Hz to be represented (excluded)
runs (int) – Number of training iterations to perform
lifetime (int) – Number of steps after which to renew the dictionary
num_dicts (int) – Number of different dictionaries to generate and train
mask (bool) – Whether to apply spectral masking
color (bool or string) – Whether color should be used, or specification of the color scheme
plot_range (slice or NoneType) – Part of the spectrogram to plot
spect_method (string) – If set to “mel”, a mel spectrogram is used for separation. Otherwise, the log-frequency spectrogram is generated via sparse pursuit.
supply_dicts (NoneType or list of array_like) – If specified, use the given dictionaries rather than computing new ones
spect_plots (sequence of int) – Time frames for which to output the spectrum as a text file

Returns

inst_dicts – Dictionaries that were used for the separation

Return type

list of ndarray

musisep.dictsep.__main__.separate_duan()[source]¶

Separation of the data by Duan et al.

Parameters: number (int) – Number of the sample to be considered.

musisep.dictsep.__main__.separate_frere_jacques()[source]¶

Separation of Bb tin whistle and viola and generalization to C tin whistle and violin, then vice versa.

musisep.dictsep.__main__.separate_jaiswal(number)[source]¶

Separation of the data by Jaiswal et al.

Parameters: number (int) – Number of the sample to be considered.

musisep.dictsep.__main__.separate_mozart_clarinet_piano()[source]¶

Separation of clarinet and piano on the piece by Mozart.

musisep.dictsep.__main__.separate_mozart_piano_mock()[source]¶

Mock separation of the piano track.

musisep.dictsep.__main__.separate_mozart_recorder_violin()[source]¶

Separation of recorder and violin on the piece by Mozart.

musisep.dictsep.__main__.separate_mozart_recorder_violin_mel()[source]¶

Separation of recorder and violin on the piece by Mozart.

musisep.dictsep.__main__.separate_urmp()[source]¶

Separation of selected samples from the URMP dataset.

musisep.dictsep.adam_b module¶

Module containing the modified ADAM algorithm.

class musisep.dictsep.adam_b.Adam_B(init, lo=0, hi=1, alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-08)[source]¶

Bases: object

Object for the ADAM algorithm with bounds, adapted for the update of instrument dictionaries. Each column refers to one instrument, and the harmonics are in rows.

Parameters

init (array-like) – Initial value for the dictionary
lo (float) – Lower bound for the dictionary entries
hi (float) – Upper bound for the dictionary entries
alpha (float) – Global step-size
beta1 (float) – Inertia of the first moment estimator
beta2 (float) – Inertia of the second moment estimator
eps (float) – Value to add in the denominator to avoid division by zero

reset(i)[source]¶

Reset an instrument to its initial state.

Parameters: i (int) – Number of the instrument

step(stepdir)[source]¶

Update the dictionary.

Parameters: stepdir (array-like) – Step direction (negative gradient)
Returns: theta – New value of the dictionary
Return type: ndarray
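
As a rough illustration of an ADAM update with bounds, the sketch below applies the standard bias-corrected ADAM step along the given step direction and then clips the result to [lo, hi]. The class name BoundedAdam is hypothetical, and clipping is an assumption: this page does not state how Adam_B enforces its bounds or how it differs from plain ADAM.

```python
import numpy as np

class BoundedAdam:
    """Hypothetical stand-in: ADAM along a step direction, clipped to [lo, hi]."""

    def __init__(self, init, lo=0.0, hi=1.0, alpha=0.001,
                 beta1=0.9, beta2=0.999, eps=1e-8):
        self.theta = np.array(init, dtype=float)
        self.lo, self.hi = lo, hi
        self.alpha, self.beta1, self.beta2, self.eps = alpha, beta1, beta2, eps
        self.m = np.zeros_like(self.theta)  # first moment estimate
        self.v = np.zeros_like(self.theta)  # second moment estimate
        self.t = 0                          # step counter for bias correction

    def step(self, stepdir):
        """Move along `stepdir` (the negative gradient), then clip to the bounds."""
        stepdir = np.asarray(stepdir, dtype=float)
        self.t += 1
        self.m = self.beta1 * self.m + (1 - self.beta1) * stepdir
        self.v = self.beta2 * self.v + (1 - self.beta2) * stepdir ** 2
        m_hat = self.m / (1 - self.beta1 ** self.t)
        v_hat = self.v / (1 - self.beta2 ** self.t)
        update = self.alpha * m_hat / (np.sqrt(v_hat) + self.eps)
        # Clipping is the assumed way of keeping entries inside [lo, hi].
        self.theta = np.clip(self.theta + update, self.lo, self.hi)
        return self.theta

# Minimize 0.5 * ||theta - target||^2; the second target lies above the bound.
target = np.array([0.5, 1.5])
opt = BoundedAdam(init=[0.2, 0.2], alpha=0.05)
for _ in range(500):
    theta = opt.step(target - opt.theta)
```

In this toy problem the first entry settles near 0.5, while the second is held at the upper bound because its target lies outside the admissible interval.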

musisep.dictsep.dictlearn module¶

Module for the training of the dictionary. When invoked, a performance test on artificial data is performed.

class musisep.dictsep.dictlearn.Learner(fsigma, tone_num, inst_num, har, m, minfreq, maxfreq, lifetime, pexp, qexp, init=None)[source]¶

Bases: object

Container object for the dictionary learning process.

Parameters

fsigma (float) – Standard deviation (frequency)
tone_num (int) – Maximum number of simultaneous tones for each instrument
inst_num (int) – Number of instruments in the dictionary
har (int) – Number of harmonics
m (int) – Height of the log-frequency spectrogram
minfreq (float) – Minimum frequency to be represented (included)
maxfreq (float) – Maximum frequency to be represented (excluded)
lifetime (int) – Number of steps after which to renew the dictionary
pexp (float) – Exponent for the addition of sinusoids
qexp (float) – Exponent to be applied on the spectrum
init (array_like) – Initial value for the dictionary

get_dict()[source]¶

Get the active part of the dictionary.

Returns: inst_dict – Dictionary with inst_num columns
Return type: ndarray

learn(y)[source]¶

Learning step. Automatically renews the dictionary.

Parameters: y (array_like) – Log-frequency spectrum
Returns: reconstruction – Synthesized spectrum
Return type: ndarray

renew_dict(headstart, newinsts)[source]¶

Renew the dictionary.

Parameters

headstart (int) – Headstart in the lifetime counter (to help new instruments)
newinsts (int) – Number of instruments to be renewed

musisep.dictsep.dictlearn.gen_random_inst(har)[source]¶

Generate random harmonic amplitudes according to a Par(1,2) distribution.

Parameters: har (int) – Number of harmonics
Returns: inst – Harmonic amplitudes for one instrument, normalized to the interval [0,1]
Return type: ndarray

musisep.dictsep.dictlearn.gen_random_inst_dict(har, inst_num)[source]¶

Generate a random instrument dictionary according to a Par(1,2) distribution.

Parameters

har (int) – Number of harmonics
inst_num (int) – Number of instruments

Returns

inst_dict – Dictionary with instruments in columns, normalized to the interval [0,1]

Return type

ndarray
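
The random initialization can be sketched as follows. Several details are assumptions rather than documented behaviour: Par(1,2) is read as a classical Pareto distribution with scale 1 and shape 2, the normalization divides each instrument by its largest amplitude, and the helper names are hypothetical.

```python
import numpy as np

def random_amplitudes(har, rng):
    """Draw `har` Pareto-distributed amplitudes and scale them into (0, 1]."""
    # Generator.pareto samples the Lomax form; adding 1 yields the classical
    # Pareto distribution with scale 1 (here with shape 2, an assumption).
    samples = rng.pareto(2.0, size=har) + 1.0
    return samples / samples.max()

def random_dictionary(har, inst_num, rng):
    """Stack one amplitude vector per instrument: harmonics in rows."""
    return np.column_stack([random_amplitudes(har, rng)
                            for _ in range(inst_num)])

rng = np.random.default_rng(0)
inst = random_amplitudes(25, rng)
inst_dict = random_dictionary(25, 2, rng)
```

The resulting array has one row per harmonic and one column per instrument, which matches the layout described for Adam_B.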

musisep.dictsep.dictlearn.learn_spect_dict(spect, fsigma, tone_num, inst_num, pexp, qexp, har, minfreq, maxfreq, runs, lifetime)[source]¶

Train the dictionary containing the relative amplitudes of the harmonics.

Parameters

spect (array_like) – Original log-frequency spectrogram of the recording
fsigma (float) – Standard deviation (frequency)
tone_num (int) – Maximum number of simultaneous tones for each instrument
inst_num (int) – Number of instruments in the dictionary
pexp (float) – Exponent for the addition of sinusoids
qexp (float) – Exponent to be applied on the spectrum
har (int) – Number of harmonics
minfreq (float) – Minimum frequency in Hz to be represented (included)
maxfreq (float) – Maximum frequency in Hz to be represented (excluded)
runs (int) – Number of training iterations to perform
lifetime (int) – Number of steps after which to renew the dictionary

Returns

inst_dict – Dictionary containing the relative amplitudes of the harmonics

Return type

ndarray

musisep.dictsep.dictlearn.make_closures(fsigma)[source]¶

Build the functions that give the bounds and initial values.

Parameters

fsigma (float) – Standard deviation of the Gaussian in the time domain.

Returns

make_bounds (lambda (length)) – Lambda that gives the bounds for length peaks
make_inits (lambda (length)) – Lambda that gives the initial values for length peaks

musisep.dictsep.dictlearn.mask_spectrums(spects, orig_spect)[source]¶

Mask the synthesized spectrograms with the original spectrogram.

Parameters

spects (list of array_like) – List of synthesized spectrograms
orig_spect (array_like) – Original spectrogram

Returns

spectrums (list of ndarray) – Masked spectrograms
mask_spect (ndarray) – Array mask
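
One common way to mask synthesized spectrograms with the original is a ratio mask, sketched below. Whether mask_spectrums uses this particular mask is not stated on this page, so the function ratio_mask and its return values are illustrative assumptions only.

```python
import numpy as np

def ratio_mask(spects, orig_spect):
    """Distribute `orig_spect` among the sources in proportion to `spects`."""
    spects = [np.asarray(s, dtype=float) for s in spects]
    orig_spect = np.asarray(orig_spect, dtype=float)
    total = np.sum(spects, axis=0)
    # Weight of each source per bin; bins where nothing was synthesized get 0.
    weights = [np.divide(s, total, out=np.zeros_like(s), where=total > 0)
               for s in spects]
    masked = [orig_spect * w for w in weights]
    return masked, weights

synth_a = [[1.0, 0.0], [2.0, 2.0]]
synth_b = [[3.0, 0.0], [2.0, 6.0]]
original = [[8.0, 5.0], [2.0, 4.0]]
masked, weights = ratio_mask([synth_a, synth_b], original)
```

With such a mask, the masked spectrograms add up to the original wherever at least one source was synthesized, so energy that the model failed to reproduce is still assigned to a source.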

musisep.dictsep.dictlearn.stoch_grad(y, inst_dict, tone_num, adam, fsigma, harscale, baseshift, inst_spect, pexp, qexp)[source]¶

Perform a dictionary training step.

Parameters

y (array_like) – Log-frequency spectrum to represent
inst_dict (ndarray) – Dictionary containing the relative amplitudes of the harmonics
tone_num (int) – Maximum number of simultaneous tones for each instrument
adam (Adam_B) – Container object for the ADAM optimizer
fsigma (float) – Standard deviation (frequency)
harscale (float) – Scaling factor
baseshift (int) – Length to add to the spectrum in order to avoid circular convolution
inst_spect (array_like) – Spectra of the instruments, in the columns
pexp (float) – Exponent for the addition of sinusoids
qexp (float) – Exponent to be applied on the spectrum

Returns

inst_dict (ndarray) – Updated dictionary
reconstruction (ndarray) – Synthesized spectrum
inst_amps (ndarray) – Summed amplitudes for each instrument

musisep.dictsep.dictlearn.synth_spect(spect, tone_num, inst_dict, fsigma, spectheight, pexp, qexp, minfreq, maxfreq, stretch=1)[source]¶

Separate and synthesize the spectrograms from the original spectrogram.

Parameters

spect (array_like) – Original log-frequency spectrogram of the recording
tone_num (int) – Maximum number of simultaneous tones for each instrument
inst_dict (ndarray) – Dictionary containing the relative amplitudes of the harmonics
fsigma (float) – Standard deviation (frequency)
spectheight (int) – Height of the linear-frequency spectrograms
pexp (float) – Exponent for the addition of sinusoids
qexp (float) – Exponent to be applied on the spectrum
minfreq (float) – Minimum frequency to be represented (included) (normalized to the sampling frequency)
maxfreq (float) – Maximum frequency to be represented (excluded) (normalized to the sampling frequency)

Returns

dict_spectrum (ndarray) – Synthesized log-frequency spectrogram with all instruments
inst_spectrums (list of ndarray) – List of synthesized log-frequency spectrograms for the instruments
dict_spectrum_lin (ndarray) – Synthesized linear-frequency spectrogram with all instruments
inst_spectrums_lin (list of ndarray) – List of synthesized linear-frequency spectrograms for the instruments

musisep.dictsep.dictlearn.test_learn(fsigma, tone_num, inst_num, pexp, qexp, har, m, runs, test_samples, lifetime, inst_dict)[source]¶

Evaluate the performance of the dictionary learning algorithm via artificial spectra.

Parameters

fsigma (float) – Width of the Gaussians in the log-frequency spectrogram
tone_num (int) – Maximum number of simultaneous tones for each instrument
inst_num (int) – Number of instruments in the dictionaries
pexp (float) – Exponent for the addition of sinusoids
qexp (float) – Exponent to be applied on the spectrum
har (int) – Number of harmonics
m (int) – Height of the log-frequency spectrogram
runs (int) – Number of training iterations to perform
test_samples (int) – Number of test spectra to generate
lifetime (int) – Number of steps after which to renew the dictionary
inst_dict (array_like) – Dictionary containing the relative amplitudes of the harmonics

Returns

measures – Array containing, in that order, the SDR, SIR, SAR with the original dictionary and the SDR, SIR, SAR with the trained dictionary

Return type

ndarray

musisep.dictsep.dictlearn.test_learn_multi(fsigma, tone_num, inst_num, pexp, qexp, har, m, runs, test_samples, lifetime, num_dicts)[source]¶

Evaluate the performance of the dictionary learning algorithm via artificial spectra.

Parameters

fsigma (float) – Width of the Gaussians in the log-frequency spectrogram
tone_num (int) – Maximum number of simultaneous tones for each instrument
inst_num (int) – Number of instruments in the dictionaries
pexp (float) – Exponent for the addition of sinusoids
qexp (float) – Exponent to be applied on the spectrum
har (int) – Number of harmonics
m (int) – Height of the log-frequency spectrogram
runs (int) – Number of training iterations to perform
test_samples (int) – Number of test spectra to generate
lifetime (int) – Number of steps after which to renew the dictionary
num_dicts (int) – Number of different dictionaries to generate and train

Returns

measures – Array containing, in the rows, the SDR, SIR, SAR with the original dictionary and the SDR, SIR, SAR with the trained dictionary

Return type

ndarray

musisep.dictsep.exptool module¶

Back-end module for the generation of spectrograms and their gradients.

musisep.dictsep.exptool.inst_scale()¶

musisep.dictsep.exptool.inst_scale_grad()¶

musisep.dictsep.exptool.inst_shift()¶

musisep.dictsep.exptool.inst_shift_dict_grad()¶

musisep.dictsep.exptool.inst_shift_grad()¶

musisep.dictsep.pursuit module¶

Module for the sparse pursuit algorithm and its helper functions.

class musisep.dictsep.pursuit.Peaks(amps, shifts, params, insts)[source]¶

Bases: object

Object to represent the parameters for the peaks in the spectrogram.

Parameters

amps (array_like) – Amplitudes
shifts (array_like) – Fundamental frequencies
params (array_like) – Extra parameters (in the rows)
insts (array_like) – Instrument numbers

copy()[source]¶

Returns: Copy of the contained peak parameters
Return type: Peaks

classmethod empty(params)[source]¶

Construct an empty Peaks object.

Returns: A Peaks object with zero peaks
Return type: Peaks

classmethod from_array(array, insts, paramlen)[source]¶

Construct a Peaks object from an array.

Parameters

array (array_like) – Array that contains, in consecutive order, the amplitudes, the fundamental frequencies, the standard deviations, and the inharmonicities
insts (array_like) – Instrument numbers
paramlen (int) – Number of extra parameters

get_array()[source]¶

Returns: Array that contains, in consecutive order, the amplitudes, the fundamental frequencies, the standard deviations, and the inharmonicities
Return type: array_like

get_params()[source]¶

Returns

amps (ndarray) – Amplitudes
shifts (ndarray) – Fundamental frequencies
params (ndarray) – Extra parameters
insts (ndarray) – Instrument numbers

merge(new)[source]¶

Merge the Peaks object with another Peaks object contained in new by concatenating the parameters.

Parameters: new (Peaks) – Object to merge with
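
The flat array layout and the merge operation can be pictured with a small stand-in class. PeakSet is hypothetical and is not the Peaks class itself; it only follows the layout described above (amplitudes, then fundamental frequencies, then one block per extra parameter) and assumes that the extra parameters are stored with one row per parameter.

```python
import numpy as np

class PeakSet:
    """Hypothetical container mirroring the documented array layout."""

    def __init__(self, amps, shifts, params, insts):
        self.amps = np.asarray(amps, dtype=float)
        self.shifts = np.asarray(shifts, dtype=float)
        # One row per extra parameter, one column per peak (assumed).
        self.params = np.atleast_2d(np.asarray(params, dtype=float))
        self.insts = np.asarray(insts, dtype=int)

    def get_array(self):
        """Concatenate amplitudes, shifts and the parameter rows."""
        return np.concatenate([self.amps, self.shifts, self.params.ravel()])

    @classmethod
    def from_array(cls, array, insts, paramlen):
        """Invert get_array for `paramlen` extra parameters per peak."""
        array = np.asarray(array, dtype=float)
        count = array.size // (2 + paramlen)
        params = array[2 * count:].reshape(paramlen, count)
        return cls(array[:count], array[count:2 * count], params, insts)

    def merge(self, new):
        """Append the peaks of `new` by concatenating every field."""
        self.amps = np.concatenate([self.amps, new.amps])
        self.shifts = np.concatenate([self.shifts, new.shifts])
        self.params = np.hstack([self.params, new.params])
        self.insts = np.concatenate([self.insts, new.insts])

peaks = PeakSet([1.0, 0.5], [100.0, 220.0], [[0.1, 0.2], [0.0, 0.01]], [0, 1])
flat = peaks.get_array()
restored = PeakSet.from_array(flat, peaks.insts, paramlen=2)
peaks.merge(PeakSet([0.3], [330.0], [[0.15], [0.02]], [0]))
```

A flat array of continuous parameters is the form that generic optimizers expect, while the integer instrument numbers stay outside of it.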

musisep.dictsep.pursuit.calc_harscale(minfreq, maxfreq, numfreqs)[source]¶

Calculate the scaling factor of the frequency axis for the log-frequency spectrogram.

Parameters

minfreq (float) – Minimum frequency to be represented (included)
maxfreq (float) – Maximum frequency to be represented (excluded)
numfreqs (int) – Intended height of the spectrogram

Returns

harscale – Scaling factor

Return type

float
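
To make the role of such a scaling factor concrete, the sketch below maps frequencies onto a log-frequency axis with numfreqs bins between minfreq (included) and maxfreq (excluded). The definition used here, bins per unit of natural-log frequency, is an assumption; the exact convention of calc_harscale is not given on this page, and both function names are hypothetical.

```python
import math

def log_axis_scale(minfreq, maxfreq, numfreqs):
    """Bins per unit of natural-log frequency (assumed definition)."""
    return numfreqs / math.log(maxfreq / minfreq)

def freq_to_bin(freq, minfreq, scale):
    """Fractional bin position of `freq` on the log-frequency axis."""
    return scale * math.log(freq / minfreq)

# Default values taken from the signature of main().
scale = log_axis_scale(20, 20480, 1024)
bin_low = freq_to_bin(20, 20, scale)
bin_octave = freq_to_bin(40, 20, scale)
```

With the default values of main, the range from 20 Hz to 20480 Hz spans ten octaves, so under this definition one octave corresponds to 102.4 bins.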

musisep.dictsep.pursuit.fft_selector(y, prenum, baseshift, inst_spect, qexp)[source]¶

Callback selector to find fundamental frequencies based on the correlation of the spectrum with the instrument spectra.

Parameters

y (array_like) – Spectrum
prenum (int) – Number of peaks to consider
baseshift (int) – Length to add to the spectrum in order to avoid circular convolution
inst_spect (array_like) – Spectra of the instruments, in the columns
qexp (float) – Exponent to be applied on the spectrum

Returns

amps (array_like) – Amplitudes
shifts (array_like) – Fundamental frequencies
insts (array_like) – Instrument numbers

musisep.dictsep.pursuit.gen_inst_spect(baseshift, fsigma, fixed_params, pexp, qexp, m, n)[source]¶

Generate an instrument log-frequency spectrum.

Parameters

baseshift (int) – Length to add to the spectrum in order to avoid circular convolution
fsigma (float) – Standard deviation (frequency)
fixed_params (sequence) – Extra fixed parameters for the synthesizer
pexp (float) – Exponent for the addition of sinusoids
qexp (float) – Exponent to be applied on the spectrum
m (int) – Height of the spectrogram
n (int) – Number of patterns/instruments

Returns

inst_spect – Spectra of the instruments, in the columns

Return type

ndarray

musisep.dictsep.pursuit.inst_scale(peaks, inst_dict, pexp, m)[source]¶

Synthesize the linear-frequency spectrum.

Parameters

peaks (Peaks) – Peak parameters
inst_dict (array_like) – Dictionary containing the relative amplitudes of the harmonics
pexp (float) – Exponent for the addition of sinusoids
m (int) – Height of the spectrogram

Returns

Linear-frequency spectrum

Return type

ndarray

musisep.dictsep.pursuit.inst_shift(peaks, fixed_params, pexp, m)[source]¶

Synthesize the log-frequency spectrum.

Parameters

peaks (Peaks) – Peak parameters
fixed_params (sequence) – Extra fixed parameters for the synthesizer
pexp (float) – Exponent for the addition of sinusoids
m (int) – Height of the spectrogram

Returns

Log-frequency spectrum

Return type

ndarray

musisep.dictsep.pursuit.inst_shift_dict_grad(peak_array, insts, fixed_params, pexp, qexp, m, y)[source]¶

Least-squares gradient function for the log-frequency spectrum w.r.t. the dictionary.

Parameters

peak_array (array_like) – Peak parameters in array form
insts (array_like) – Instrument numbers
fixed_params (sequence) – Extra fixed parameters for the synthesizer
pexp (float) – Exponent for the addition of sinusoids
qexp (float) – Exponent to be applied on the spectrum
m (int) – Height of the spectrogram
y (array_like) – Spectrum to compare with

Returns

grad – Least-squares gradient w.r.t. the dictionary

Return type

ndarray

musisep.dictsep.pursuit.inst_shift_grad(peak_array, insts, fixed_params, pexp, qexp, m, y)[source]¶

Least-squares gradient function for the log-frequency spectrum w.r.t. the parameters.

Parameters

peak_array (array_like) – Peak parameters in array form
insts (array_like) – Instrument numbers
fixed_params (sequence) – Extra fixed parameters for the synthesizer
pexp (float) – Exponent for the addition of sinusoids
qexp (float) – Exponent to be applied on the spectrum
m (int) – Height of the spectrogram
y (array_like) – Spectrum to compare with

Returns

grad – Least-squares gradient

Return type

ndarray

musisep.dictsep.pursuit.inst_shift_obj(peak_array, insts, fixed_params, pexp, qexp, m, y)[source]¶

Least-squares objective function for the log-frequency spectrum.

Parameters

peak_array (array_like) – Peak parameters in array form
insts (array_like) – Instrument numbers
fixed_params (sequence) – Extra fixed parameters for the synthesizer
pexp (float) – Exponent for the addition of sinusoids
qexp (float) – Exponent to be applied on the spectrum
m (int) – Height of the spectrogram
y (array_like) – Spectrum to compare with

Returns

obj – Least-squares error

Return type

float

musisep.dictsep.pursuit.max_selector(y, prenum, n)[source]¶

Callback selector to find peaks based on the local maxima which are dominant in a discrete interval, viewed from its midpoint.

Parameters

y (array_like) – Spectrum
prenum (int) – Number of peaks to consider
n (int) – Length of the interval

Returns

amps (array_like) – Amplitudes
shifts (array_like) – Frequencies
insts (array_like) – Instrument numbers (always 0)
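
The selection rule can be read as follows: a sample counts as a peak if it is the largest value inside a window of length n centred on it, and the prenum strongest such peaks are returned. The sketch below implements that reading. The function name, the handling of the spectrum boundaries and the treatment of ties are assumptions.

```python
import numpy as np

def dominant_maxima(y, prenum, n):
    """Return the `prenum` strongest maxima that dominate a window of length n."""
    y = np.asarray(y, dtype=float)
    half = n // 2
    candidates = []
    for i in range(y.size):
        # The window is centred on i and truncated at the boundaries.
        lo, hi = max(0, i - half), min(y.size, i + half + 1)
        if y[i] > 0 and y[i] == y[lo:hi].max():
            candidates.append(i)
    candidates = np.array(candidates, dtype=int)
    # Sort the candidates by amplitude, strongest first.
    order = np.argsort(y[candidates])[::-1][:prenum]
    shifts = candidates[order]
    insts = np.zeros(shifts.size, dtype=int)  # single pattern: always 0
    return y[shifts], shifts, insts

spectrum = [0, 1, 3, 1, 0, 2, 5, 2, 0, 4]
amps, shifts, insts = dominant_maxima(spectrum, prenum=2, n=3)
```

In this sketch, equal values on a plateau are all accepted as candidates; a production selector would probably need an explicit tie-breaking rule.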

musisep.dictsep.pursuit.peak_pursuit(y, nums, prenum, runs, n, inst_shift, inst_shift_obj, inst_shift_grad, make_bounds, make_inits, fixed_params, selector, selector_args, pexp, qexp, beta=1, init=None)[source]¶

Sparse pursuit algorithm for the identification of peaks in a spectrum.

Parameters

y (array_like) – Spectrum
nums (int) – Maximum number of peaks
prenum (int) – Number of new peaks to consider per iteration
runs (int) – Maximum number of training iterations
n (int) – Number of patterns/instruments
inst_shift (callable (peaks, fixed_params, pexp, m, n)) – Synthesizing function
inst_shift_obj (callable (peak_array, insts, fixed_params, pexp, qexp, m, n, y)) – Synthesizing function objective
inst_shift_grad (callable (peak_array, insts, fixed_params, pexp, qexp, m, n, y)) – Synthesizing function gradient
make_bounds (lambda (length)) – Lambda that gives the bounds for length peaks
make_inits (lambda (length)) – Lambda that gives the initial values for length peaks
fixed_params (sequence) – Extra fixed parameters for the synthesizer
selector (function) – Callback selector accepting y and prenum as arguments
selector_args (sequence) – Extra arguments to pass to the selector
pexp (float) – Exponent for the addition of sinusoids
qexp (float) – Exponent to be applied on the spectrum
beta (float) – Residual reduction factor
init (Peaks) – Initial value for the peaks

Returns

peaks (Peaks) – Identified peaks
reconstruction (ndarray) – Synthesized spectrum

musisep.dictsep.pursuit.test_pattern(peaks, fixed_params, pexp, m)[source]¶

Evaluate a test pattern.

Parameters

peaks (Peaks) – Continuous parameters for the peaks
fixed_params (sequence) – Extra fixed parameters
pexp (float) – (ignored)
m (int) – (ignored)

Returns

y – Sampled test pattern

Return type

ndarray

musisep.dictsep.pursuit.test_pattern_comp(x, amps, shifts, sigmas)[source]¶

Evaluate a test pattern.

Parameters

x (array_like) – Positions to evaluate the pattern
amps (array_like) – Amplitudes of the peaks
shifts (array_like) – Positions of the peaks on the axis
sigmas (array_like) – Standard deviations of the peaks

Returns

y – Evaluation of the test pattern

Return type

ndarray
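
Since each peak is described by an amplitude, a position and a standard deviation, a natural reading is a sum of Gaussian peaks. The sketch below evaluates such a pattern. The unnormalized Gaussian shape and the function name are assumptions, not taken from the source.

```python
import numpy as np

def gaussian_pattern(x, amps, shifts, sigmas):
    """Evaluate sum_k amps[k] * exp(-(x - shifts[k])^2 / (2 * sigmas[k]^2))."""
    x = np.asarray(x, dtype=float)[:, np.newaxis]  # one row per position
    amps = np.asarray(amps, dtype=float)
    shifts = np.asarray(shifts, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    # Broadcasting yields a (positions x peaks) matrix; sum over the peaks.
    peaks = amps * np.exp(-(x - shifts) ** 2 / (2 * sigmas ** 2))
    return peaks.sum(axis=1)

positions = np.arange(5.0)
pattern = gaussian_pattern(positions, amps=[2.0, 1.0],
                           shifts=[1.0, 3.0], sigmas=[1.0, 0.5])
```

Each peak contributes its full amplitude at its own position, plus the tails of the neighbouring peaks.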

musisep.dictsep.pursuit.test_pattern_gen(seed, scaling)[source]¶

Generate the parameters for a random pattern.

Parameters

seed (int) – Random seed
scaling (float) – Scaling of the axis

Returns

amps (ndarray) – Amplitudes of the peaks
shifts (ndarray) – Positions of the peaks on the axis
sigmas (ndarray) – Standard deviations of the peaks

musisep.dictsep.pursuit.test_pattern_grad(peak_array, insts, fixed_params, pexp, qexp, m, y)[source]¶

Gradient for a test pattern.

Parameters

peak_array (array_like) – Array of all the continuous parameters
insts (array_like) – Number of the instruments/patterns
fixed_params (sequence) – Extra fixed parameters
pexp (float) – (ignored)
qexp (float) – (ignored)
m (int) – (ignored)
y (array_like) – Evaluation of the test pattern

Returns

grad – Gradient for the test pattern

Return type

ndarray

musisep.dictsep.pursuit.test_pattern_grad_helper(x, r, amps, shifts, pat_amps, pat_shifts, pat_sigmas)[source]¶

Helper function for the computation of the gradient of the test pattern.

Parameters

x (array_like) – Positions where pattern was evaluated
r (array_like) – Residual of the pattern
amps (array_like) – Amplitudes of the patterns
shifts (array_like) – Shifts of the peaks
pat_amps (array_like) – Amplitudes of the peaks for each pattern
pat_shifts (array_like) – Positions of the peaks on the axis for each pattern
pat_sigmas (array_like) – Standard deviations of the peaks for each pattern

Returns

grad – Gradient for the test pattern

Return type

ndarray

musisep.dictsep.pursuit.test_pattern_obj(peak_array, insts, fixed_params, pexp, qexp, m, y)[source]¶

Loss objective for a test pattern.

Parameters

peak_array (array_like) – Array of all the continuous parameters
insts (array_like) – Number of the instruments/patterns
fixed_params (sequence) – Extra fixed parameters
pexp (float) – (ignored)
qexp (float) – (ignored)
m (int) – (ignored)
y (array_like) – Evaluation of the test pattern

Returns

obj – Least-squares error

Return type

float

musisep.dictsep.pursuit.test_pursuit()[source]¶

Testing the pursuit algorithm on a generic example.
