Neuronal Dynamics (10)

Estimating Parameters of Probabilistic Neuron Models

The online version of this chapter: Chapter 10, Estimating Parameters of Probabilistic Neuron Models, https://neuronaldynamics.epfl.ch/online/Ch10.html


Parameter optimization in linear and nonlinear models

Linear Models

Suppose the maximal amplitude of the input current has been chosen small enough for the neuron to stay in the subthreshold regime, so that the membrane potential is described by the linear model \[ u(t)=\int_{0}^{\infty} \kappa(s)I(t-s) \mathrm{d}s+u_{rest}. \tag{10.1} \] The vector \[ \mathbf{k}=(\kappa(\mathrm{d}t),\kappa(2 \mathrm{d}t),\cdots ,\kappa(K \mathrm{d}t)) \tag{10.2} \] describes the time course of \(\kappa\) in discrete time. The input current \(I\) during the last \(K\) time steps is collected in the vector \[ \mathbf{x}_{t}=(I_{t-1},\cdots ,I_{t-K})\mathrm{d}t. \tag{10.3} \]

The discrete-time version of (10.1) is \[ u_t=\sum_{l=1}^{K} k_lI_{t-l}\mathrm{d}t+u_{rest}=\mathbf{k}\cdot \mathbf{x}_t+u_{rest}. \tag{10.4} \]
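As a minimal numerical sketch of (10.4), with a hypothetical exponential filter \(\kappa\) and a random input current (all values assumed, not from the text), the discrete-time model is just a dot product between the filter vector \(\mathbf{k}\) and the most recent input samples:

```python
import numpy as np

# Hypothetical setup: exponential membrane filter kappa and random input current.
dt = 1e-4                                   # time step (s), assumed
K = 200                                     # number of filter taps
u_rest = -65e-3                             # resting potential (V), assumed

s = dt * np.arange(1, K + 1)
kappa = np.exp(-s / 10e-3)                  # filter time course (arbitrary units)

rng = np.random.default_rng(0)
I = rng.normal(0.0, 1.0, size=5000)         # input current (arbitrary units)

# Eq. (10.4): u_t = sum_{l=1}^{K} kappa_l * I_{t-l} * dt + u_rest, for t >= K
u = np.array([kappa @ I[t - K:t][::-1] * dt + u_rest for t in range(K, len(I))])
```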

We compare the prediction \(u_t\) of the model equation (10.4) with the experimental measurement \(u_t^{exp}\). The components of the vector \(\mathbf{k}\) are chosen such that the squared error \[ E(\mathbf{k})=\sum_{t=K+1}^{T} [u_t^{exp}-u_t]^{2} \tag{10.5} \] is minimal.

Example: Analytical solution

Let

\[ X= \begin{pmatrix} I_{K} & I_{K-1} & \cdots & I_1 \\ I_{K+1} & I_{K} & \cdots & I_2 \\ \vdots & \vdots & & \vdots \\ I_{T-1} & I_{T-2} & \cdots & I_{T-K} \\ \end{pmatrix} \mathrm{d}t \]

\(\mathbf{u}^{exp}=(u_{K+1}^{exp},\cdots ,u_{T}^{exp})^{\mathsf{T}}\), and let \(\mathbf{u}_{rest}\) be a vector with all components equal to \(u_{rest}\).

Then (10.4) can be written as \[ \mathbf{u}=X \mathbf{k}+\mathbf{u}_{rest}. \tag{10.6} \] At the minimum of (10.5), the gradient vanishes, \[ \nabla _{\mathbf{k}} E=0. \]

Since \[ E(\mathbf{k})=[\mathbf{u}^{exp}-X \mathbf{k}-\mathbf{u}_{rest}]^{\mathsf{T}}\cdot [\mathbf{u}^{exp}-X \mathbf{k}-\mathbf{u}_{rest}], \tag{10.7} \] setting the gradient to zero yields the normal equations \(X^{\mathsf{T}}X \mathbf{k}=X^{\mathsf{T}}(\mathbf{u}^{exp}-\mathbf{u}_{rest})\).

The solution is the parameter vector \[ \hat{\mathbf{k}}_{LS}=(X^{\mathsf{T}}X)^{-1}X^{\mathsf{T}}(\mathbf{u}^{exp}-\mathbf{u}_{rest}) \tag{10.8} \]

assuming the matrix \(X^{\mathsf{T}}X\) is invertible. (If this matrix is non-invertible, then a unique minimum does not exist.)
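A minimal sketch of the least-squares estimator (10.8), assuming the current trace I and the measured subthreshold voltage u_exp are already available as NumPy arrays (hypothetical names); np.linalg.lstsq is used instead of the explicit matrix inverse for numerical stability:

```python
import numpy as np

def fit_kappa_ls(I, u_exp, K, dt, u_rest):
    """Least-squares estimate of the filter vector k (eq. 10.8).

    I     : input current samples, shape (T,)
    u_exp : measured subthreshold voltage, shape (T,)
    K     : number of filter taps
    """
    T = len(I)
    # Row for time step t holds x_t = (I_{t-1}, ..., I_{t-K}) * dt, for t = K, ..., T-1.
    X = np.array([I[t - K:t][::-1] for t in range(K, T)]) * dt
    y = u_exp[K:] - u_rest
    # k_LS = (X^T X)^{-1} X^T (u_exp - u_rest); lstsq avoids the explicit inverse.
    k_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    return k_hat
```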

Generalized Linear Models

We start from the deterministic formulation of the SRM (Spike Response Model), \[ u(t)=\int_{0}^{\infty} \eta(s)S(t-s) \mathrm{d}s+\int_{0}^{\infty} \kappa(s)I(t-s) \mathrm{d}s+u_{rest}. \tag{10.9} \]

Suppose that the spike-history filter \(\eta\) extends over a maximum of \(J\) time steps. Then we can introduce a new parameter vector \[ \mathbf{k}=(\kappa(\mathrm{d}t),\kappa(2 \mathrm{d}t),\cdots ,\kappa(K \mathrm{d}t),\eta(\mathrm{d}t),\eta(2 \mathrm{d}t),\cdots ,\eta(J \mathrm{d}t),u_{rest}).\tag{10.10} \]

The spike train in the last \(J\) time steps is represented by the spike count sequence \(n_{t-1},n_{t-2},\cdots ,n_{t-J}\), where \(n_t \in \{0,1\}\), and is included in the 'input' vector \[ \mathbf{x}_t=(I_{t-1}\mathrm{d}t,I_{t-2}\mathrm{d}t,\cdots ,I_{t-K}\mathrm{d}t,n_{t-1},n_{t-2},\cdots ,n_{t-J},1). \tag{10.11} \] The discrete-time version of (10.9) then reads \[ u_t=\sum_{j=1}^{J} k_{K+j}n_{t-j}+\sum_{k=1}^{K} k_{k}I_{t-k}\mathrm{d}t+u_{rest}=\mathbf{k}\cdot \mathbf{x}_t. \tag{10.12} \]

The firing intensity is then \[ \rho(t)=f(u(t)-\theta)=f(\mathbf{k}\cdot \mathbf{x}_t-\theta), \tag{10.13} \] where \(\theta\) denotes the firing threshold and \(f\) the escape-rate nonlinearity.
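A sketch of (10.11)-(10.13) with the common choice \(f(u)=\exp(u)\); the function below (hypothetical names, not from the text) stacks the recent current, the recent spike counts, and a constant 1 into the 'input' vector and evaluates the firing intensity:

```python
import numpy as np

def glm_rate(I, n, t, k, K, J, dt, theta=0.0):
    """Firing intensity rho_t = f(k . x_t - theta) with f = exp (eqs. 10.11-10.13).

    I : input current array; n : 0/1 spike counts; t : current time index (t >= max(K, J));
    k : parameter vector (K kappa taps, J eta taps, u_rest), length K + J + 1.
    """
    x_t = np.concatenate([
        I[t - K:t][::-1] * dt,       # I_{t-1} dt, ..., I_{t-K} dt
        n[t - J:t][::-1],            # n_{t-1}, ..., n_{t-J}
        [1.0],                       # constant component multiplying u_rest
    ])
    return np.exp(np.dot(k, x_t) - theta)
```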

Statistical Formulation of Encoding Models

A neural 'encoding model' is a model that assigns a conditional probability, \(p(D|\mathbf{x})\), to any possible neural response \(D\) given a stimulus \(\mathbf{x}\). We hypothesize some encoding model, \[ p(D|\mathbf{x},\theta). \tag{10.16} \] Here \(\theta\) is a short-hand notation for the set of all model parameters. In the example of the previous section, the model parameters are \(\theta=\{\mathbf{k}\}\).

Parameter estimation

To find a good estimate of \(\theta\) for a chosen model class, we proceed in three steps:

- Introduce a model that makes sense biophysically and incorporates our prior knowledge in a tractable manner.
- Write down the likelihood of the observed data given the model parameters, along with a prior distribution that encodes our prior beliefs about the model parameters.
- Compute the posterior distribution of the model parameters given the observed data, using Bayes' rule, \[ p(\theta|D)\propto p(D|\theta)p(\theta).\tag{10.18} \]

The maximum likelihood (ML) estimate: \[ \hat{\mathbf{k}}_{ML}=\mathop{\arg\max}_\mathbf{k}\{p(D|X,\mathbf{k})\} \]

The maximum a posteriori (MAP) estimate: \[ \hat{\mathbf{k}}_{MAP}=\mathop{\arg\max}_\mathbf{k}\{p(D|X,\mathbf{k})p(\mathbf{k})\} \]

where the maximization runs over all possible parameter choices.

Assume that the spike count per bin follows a Poisson distribution, \[ n_t \sim Poiss[\rho(t)\mathrm{d}t]. \]

With the rate parameter of the Poisson distribution given by the GLM/SRM model, \(\rho(t)=f(\mathbf{k}\cdot \mathbf{x}_t)\), we have \[ p(D|X,\mathbf{k})=\prod_{t}^{} \left\{ \frac{[f(\mathbf{k}\cdot \mathbf{x}_t) \mathrm{d}t]^{n_t}}{(n_t)!}\exp [-f(\mathbf{x}_t\cdot \mathbf{k})\mathrm{d}t]\right\} \]

For a given observed spike train, the spike count numbers \(n_t\) are fixed, so we treat them as constants. We reshuffle the terms and consider the logarithm, \[ \log p(D|X,\mathbf{k})=c_0+\sum_{t} \{ n_t \log f(\mathbf{k}\cdot \mathbf{x}_t)-f(\mathbf{x}_t\cdot \mathbf{k})\mathrm{d}t\} \tag{10.23} \]

If we assume that

- \(f(u)\) is a convex (upward-curving) function of its scalar argument \(u\), and
- \(\log f(u)\) is concave (downward-curving) in \(u\),

then (10.23) is guaranteed to be a concave function of the parameter vector \(\mathbf{k}\). Here the 'input' vector is slightly redefined to carry an explicit constant component, \[ \mathbf{x}_t=(1,I_{t-1}\mathrm{d}t,I_{t-2}\mathrm{d}t,\cdots ,I_{t-K}\mathrm{d}t,n_{t-1},n_{t-2},\cdots ,n_{t-J}), \tag{10.24} \] and the parameter vector becomes \[ \mathbf{k}=(b,\kappa(\mathrm{d}t),\kappa(2\mathrm{d}t),\cdots ,\kappa(K\mathrm{d}t),\eta(\mathrm{d}t),\eta(2\mathrm{d}t),\cdots ,\eta(J\mathrm{d}t)), \tag{10.25} \] where \(b=u_{rest}-\theta\) is a constant offset term which we also want to optimize.
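A minimal maximum-likelihood sketch under these assumptions, taking \(f(u)=\exp(u)\) (which is convex and log-concave, so (10.23) is concave in \(\mathbf{k}\)); the design matrix X stacks the vectors \(\mathbf{x}_t\) of (10.24), and all names and data are hypothetical:

```python
import numpy as np
from scipy.optimize import minimize

def fit_glm_ml(X, n, dt):
    """Maximize the Poisson log-likelihood (10.23) with f(u) = exp(u).

    X : design matrix whose rows are the vectors x_t of (10.24); n : spike count per bin.
    """
    def neg_log_lik(k):
        u = X @ k
        # -(sum_t n_t * u_t - exp(u_t) * dt); the constant c_0 is dropped.
        return -(n @ u - np.exp(u).sum() * dt)

    def grad(k):
        u = X @ k
        return -(X.T @ (n - np.exp(u) * dt))

    k0 = np.zeros(X.shape[1])
    # The objective is convex (log-likelihood concave), so the optimum found is global.
    res = minimize(neg_log_lik, k0, jac=grad, method="L-BFGS-B")
    return res.x
```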

Note that \(\rho(t)\) depends on the past spike train; therefore \(D=\{n_t\}\) is no longer a Poisson process, and the spike counts are only conditionally Poisson given the history.

Finally, we expand the definition of \(X\) to include observations of other cells' spike trains. Spike counts are conditionally Poisson distributed given \(\rho_i(t)\), i.e., \(n_{i,t}\sim Poiss(\rho_i(t)\mathrm{d}t)\), with a firing rate \[ \rho_i(t)=f\left(\mathbf{k}_i\cdot \mathbf{x}_t+\sum_{i'\neq i}^{}\sum_{j}^{} \varepsilon_{i',j}n_{i',t-j}\right) \]

where the coupling terms are summed over all past spike activity \(n_{i',t-j}\) in the population of cells.

- \(\rho_i(t)\): the instantaneous firing rate of the \(i\)-th cell at time \(t\).
- \(\mathbf{k}_i\): the cell's linear receptive field, including spike-history effects.
- \(\varepsilon_{i',j}\): the net effect of a spike of neuron \(i'\) on the membrane potential of neuron \(i\). If we record from all neurons in the population, \(\varepsilon_{i',j}\) can be interpreted as the excitatory or inhibitory postsynaptic potential caused by a spike of neuron \(i'\) a time \(j \mathrm{d}t\) earlier.

Example: Linear regression and voltage estimation

We want to show that the standard procedure of least-squares minimization can be linked to statistical parameter estimation under the assumption of Gaussian noise.

Set \(\mathbf{x}_t=(I_{t-1},\cdots ,I_{t-K})\mathrm{d}t\) and \(\mathbf{k}=(\kappa(\mathrm{d}t),\cdots ,\kappa(K\mathrm{d}t))\).

If we assume that the discrete-time voltage measurements have a Gaussian distribution around the mean predicted by (10.4), then the log-likelihood to be maximized is \[ \log p(D|X,\mathbf{k})=c_1-c_2 \sum_{t}^{} (u_t-(\mathbf{k}\cdot \mathbf{x}_t))^{2}, \tag{10.28} \] where \[ X=\begin{pmatrix} \mathbf{x}_1/\mathrm{d}t \\ \vdots \\ \mathbf{x}_T/\mathrm{d}t \end{pmatrix} \]

\(c_1,c_2\) are constants that do not depend on the parameter \(\mathbf{k}\). Maximization yields \(\mathbf{k}_{opt}=(X^{\mathsf{T}}X)^{-1}(\sum_{t}^{} u_t \mathbf{x}_t/\mathrm{d}t)\) which determines the time course of the filter \(\kappa(s)\). The result is identical to (10.8): \[ \hat{\mathbf{k}}_{opt}=\hat{\mathbf{k}}_{LS} \]

Regularization: maximum penalized likelihood

We want to maximize the posterior \[ p(\mathbf{k}|X,D)\propto p(D|X,\mathbf{k})p(\mathbf{k}) \tag{10.30} \] here \(p(\mathbf{k})\) encodes our a priori beliefs about the true underlying \(\mathbf{k}\).

For the Poisson likelihood of the GLM, the log-posterior reads \[ \log p(\mathbf{k}|X,D)=c+\log p(\mathbf{k})+\sum_{t}^{} (n_t \log f(\mathbf{x}_t\cdot \mathbf{k})-f(\mathbf{x}_t\cdot \mathbf{k})\mathrm{d}t). \tag{10.32} \]

Example: Linear regression and Gaussian prior

Consider a zero-mean Gaussian prior \[ \log p(\mathbf{k})=c-\mathbf{k}^{\mathsf{T}}A \mathbf{k}/2, \] where \(A\) is a positive-definite matrix (the inverse covariance matrix). Combining this with (10.28) and maximizing the corresponding posterior leads directly to the regularized least-squares estimator \[ \hat{\mathbf{k}}_{RLS}=(X^{\mathsf{T}}X+A)^{-1}\left(\sum_{t}^{} u_t \mathbf{x}_t/\mathrm{d}t\right) \]
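A sketch of the regularized estimator, assuming the simplest isotropic prior \(A=\lambda I\) (ridge regression); X is a design matrix with rows \(\mathbf{x}_t\) and y the centered voltage measurements, both hypothetical:

```python
import numpy as np

def fit_kappa_ridge(X, y, lam=1.0):
    """MAP / regularized least-squares estimate with Gaussian prior A = lam * I.

    X : design matrix with rows x_t; y : (u_exp - u_rest) at the matching times.
    """
    A = lam * np.eye(X.shape[1])            # inverse prior covariance, assumed isotropic
    # k_RLS = (X^T X + A)^{-1} X^T y, solved as a linear system for numerical stability.
    return np.linalg.solve(X.T @ X + A, X.T @ y)
```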

Evaluating Goodness-of-fit

In the following we assume that the goodness-of-fit quantities are computed using 'cross-validation': the parameters are estimated on a training set, and the goodness of fit is then evaluated on a separate test set.

Comparing Spiking Membrane Potential Recordings

If the model is optimized with a least-squares error criterion, this implicitly assumes that the residual errors are normally distributed. Note that the voltage distribution is only Gaussian as long as all voltage traces stay below the threshold (neglecting the part of the distribution whose probability approaches zero).

The following passage from [1] partly explains the close relationship between the least-squares method and Gaussian-distributed errors:


The use of the least squares method for extracting information from imperfect observations assumes a specific a priori probability distribution for the errors, viz. the Gauss distribution. The same assumption, however, cannot be true for all variables that might be used to measure the observed quantity (but at most for one variable, and all those that are linearly connected with it). The method of least squares applied in the frequency scale does not lead to the same result as when applied to the same observations plotted according to wavelengths. The 'best value' for the brightness of a star depends on whether one applies the method of least squares to the magnitude or to its intensity in energy measure. The redeeming feature is, as long as the errors are small, that any reasonable transformation is practically linear in the relevant range. But there is no logical foundation for applying it to widely scattered data.


The goodness-of-fit in terms of subthreshold membrane potential away from spikes is considered separately from the goodness-of-fit in terms of the spike times only.

We compute the squared error between the recorded membrane potential \(u_t^{exp}\) and the model membrane potential \(u_t^{mod}\), with spikes forced in the model at the times of the observed ones. All voltage traces start at the same point, and the stimulation is repeated \(N_{rep}\) times. For the subthreshold membrane potential, we average the squared error over all recorded times \(t\) that are not too close to an action potential: \[ RMSE_{nm}=\sqrt{\frac{1}{T_{\Omega_1}N_{rep}}\sum_{i=1}^{N_{rep}} \int_{\Omega_1}^{} (u_i^{exp}(t)-u_i^{mod}(t))^{2} \mathrm{d}t} \tag{10.37} \] where \(\Omega_1\) denotes the ensemble of time bins at least 5 ms away from any spike and \(T_{\Omega_1}\) is the total time in \(\Omega_1\). In \(RMSE_{nm}\), the index \(n\) stands for 'neuron' and the index \(m\) for 'model'.

To estimate the intrinsic variability of the neuron, we take times that are sufficiently far from a spike in any repetition and compute the average squared error between pairs of recorded traces \[ RMSE_{nn}=\sqrt{\frac{2}{T_{\Omega_2}N_{rep}(N_{rep}-1)}\sum_{i=1}^{N_{rep}} \sum_{j=1}^{i-1} \int_{\Omega_2}^{} (u_j^{exp}(t)-u_i^{exp}(t))^{2} \mathrm{d}t} \tag{10.38} \] where \(\Omega_2\) denotes the ensemble of time bins far from the spike times in all repetitions and \(T_{\Omega_2}\) is the total time in \(\Omega_2\). Typically, 20 ms before and 200 ms after a spike is considered sufficiently far (spike afterpotentials can extend for longer, so this is a rather crude approximation). Because the earlier spiking history still affects the membrane potential, the \(RMSE_{nn}\) calculated in (10.38) is an overestimate of the intrinsic error.

We compare the model error with the intrinsic error by taking the ratio \[ RMSER=\frac{RMSE_{nn}}{RMSE_{nm}} \tag{10.39} \]

The root-mean-squared-error ratio (RMSER) reaches one if the model precision matches the intrinsic error. An RMSER smaller than one indicates that the model could still be improved. Values larger than one are possible because \(RMSE_{nn}\) is an overestimate of the true intrinsic error.
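A sketch of (10.37)-(10.39), assuming the recorded and model voltage traces are stored as arrays of shape (N_rep, T_bins) and the sets \(\Omega_1\), \(\Omega_2\) are boolean masks over time bins; all names are hypothetical:

```python
import numpy as np

def rmse_nm(u_exp, u_mod, mask1):
    """RMSE between data and model (eq. 10.37); u_exp, u_mod have shape (N_rep, T_bins)."""
    err = (u_exp[:, mask1] - u_mod[:, mask1]) ** 2
    return np.sqrt(err.mean())

def rmse_nn(u_exp, mask2):
    """Intrinsic trial-to-trial RMSE (eq. 10.38), averaged over all pairs of repetitions."""
    n_rep = u_exp.shape[0]
    acc = 0.0
    for i in range(n_rep):
        for j in range(i):
            acc += ((u_exp[j, mask2] - u_exp[i, mask2]) ** 2).mean()
    return np.sqrt(2.0 * acc / (n_rep * (n_rep - 1)))

def rmser(u_exp, u_mod, mask1, mask2):
    """Root-mean-squared-error ratio (eq. 10.39)."""
    return rmse_nn(u_exp, mask2) / rmse_nm(u_exp, u_mod, mask1)
```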

Spike Train Likelihood

The likelihood that a spike train \(S\) with spikes at \(t^{(1)}<t^{(2)}<\cdots <t^{(n)}\) is generated by the model is \[ L^{n}(S|\theta)=\prod_{t^{(i)}\in S}^{} \rho(t^{(i)}|S,\theta)\exp \left[ -\int_{0}^{T} \rho(s|S,\theta) \mathrm{d}s\right] \tag{10.40} \] where we write \(\rho(t^{(i)}|S,\theta)\) to emphasize that the firing intensity at \(t^{(i)}\) depends on the stimulus and spike history as well as on the model parameters \(\theta\).

We compare \(L^{n}\) with the likelihood of a homogeneous Poisson model with constant firing intensity \(\rho_0=n/T\). The difference in log-likelihood between the two models is divided by the total number \(n\) of observed spikes in order to obtain a quantity with units of 'bits per spike': \[ \frac{1}{n}\log _{2}\frac{L^{n}(S|\theta)}{\rho_0^{n}\mathrm{e}^{-\rho_0 T} } \tag{10.41} \]

This quantity can be interpreted as an instantaneous mutual information between the spike count in a single time bin and the stimulus given the parameters.
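A sketch of (10.41) in discrete time, assuming the model intensity \(\rho(t)\) has already been evaluated in every bin and the observed counts \(n_t\) are given; names are hypothetical:

```python
import numpy as np

def bits_per_spike(rho, n, dt):
    """Log-likelihood gain over a homogeneous Poisson model, in bits per spike (eq. 10.41).

    rho : model intensity per bin (must be > 0 everywhere); n : observed counts per bin.
    """
    n_spikes = n.sum()
    T = len(n) * dt
    rho0 = n_spikes / T                                  # homogeneous Poisson rate
    # Log-likelihood difference; the dt^n and n_t! factors cancel between the two models.
    ll_model = np.sum(n * np.log(rho * dt)) - np.sum(rho) * dt
    ll_pois = n_spikes * np.log(rho0 * dt) - rho0 * T
    return (ll_model - ll_pois) / (n_spikes * np.log(2))
```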

Time-rescaling Theorem

For a spike train with spikes at \(t^{(1)}<t^{(2)}<\cdots <t^{(n)}\) and with firing intensity \(\rho(t|S,\theta)\), the time-rescaling transformation \(t \to \Lambda(t)\) is defined as \[ \Lambda(t)=\int_{0}^{t} \rho(x|S,\theta) \mathrm{d}x. \tag{10.42} \]

Somewhat surprisingly, the rescaled spike times \(\Lambda(t^{(k)})\) form a Poisson process with unit rate. A corollary of this time-rescaling theorem is that the time intervals \[ \Lambda(t^{(k)})-\Lambda(t^{(k-1)}) \tag{10.43} \] are independent random variables with an exponential distribution. Rescaling the time axis once more with the transformation \[ z_k=1-\exp [-(\Lambda(t^{(k)})-\Lambda(t^{(k-1)}))] \tag{10.44} \] yields independent random variables uniformly distributed on the interval from zero to one.

To verify that the \(z_k\)'s are independent, we can look at the serial correlation of the rescaled intervals or make a scatter plot of \(z_{k+1}\) against \(z_k\). Testing whether the \(z_k\)'s are uniformly distributed can be done with a Kolmogorov-Smirnov (K-S) test, which compares the empirical cumulative distribution \(P(z)\) of the \(z_k\) with a reference distribution. In our case the reference is the uniform distribution, whose cumulative is simply \(z\). Thus, \[ D=\text{sup}_{z}\lvert P(z)-z \rvert . \tag{10.45} \]
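A sketch of the time-rescaling check (10.42)-(10.45), assuming the model intensity is available on a fine time grid and the spikes are given as bin indices; scipy's kstest is used for the comparison against the uniform distribution:

```python
import numpy as np
from scipy.stats import kstest

def time_rescaling_ks(rho, spike_idx, dt):
    """Time-rescaling check (eqs. 10.42-10.45).

    rho : model intensity on a fine time grid; spike_idx : bin indices of observed spikes.
    """
    Lambda = np.cumsum(rho) * dt                     # Lambda(t) = int_0^t rho(x) dx
    Lk = Lambda[spike_idx]                           # rescaled spike times
    intervals = np.diff(np.concatenate(([0.0], Lk)))
    z = 1.0 - np.exp(-intervals)                     # should be i.i.d. uniform on [0, 1]
    D, p_value = kstest(z, "uniform")                # D = sup_z |P(z) - z|
    return z, D, p_value
```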

The time-rescaling theorem, together with the K-S test, provides a useful goodness-of-fit measure for spike-train data, with confidence intervals, that does not require multiple repetitions.

Spike Train Metric

Evaluating the goodness-of-fit in terms of the log-likelihood or the time-rescaling theorem requires that we know the conditional firing intensity \(\rho(t|S,\theta)\) accurately.

Another approach for comparing spike trains involves defining a metric between spike trains.

Let us consider spike trains as vectors in an abstract vector space, with these vectors denoted in boldface: \(\mathbf{S}\). For now, consider the general form \[ (\mathbf{S}_{i}, \mathbf{S}_{j})=\int_{0}^{T} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} K_{\Delta}(s,s')S_i(t-s)S_j(t-s') \mathrm{d}s \mathrm{d}s' \mathrm{d}t, \tag{10.46} \] where \(K_{\Delta}\) is a two-dimensional coincidence kernel with a scaling parameter \(\Delta\), and \(T\) is the maximum length of the spike trains. \(K_{\Delta}\) is required to be a non-negative function with a global maximum at \(s=s'=0\). Moreover, \(K_{\Delta}(s,s')\) should fall off rapidly, so that \(K_{\Delta}(s,s')\approx 0\) for all \(s,s'>\Delta\). An example of such a kernel is \(K_{\Delta}(s,s')=k_1(s)k_2(s')=\frac{1}{\Delta^{2}}\exp [-(s+s')/\Delta]\Theta(s)\Theta(s')\). The scaling parameter \(\Delta\) must be small, much smaller than the length \(T\) of the spike train.

With \(K_{\Delta}(s,s')=\delta(s)\delta(s')\), we observe that \((\mathbf{S}_{i}, \mathbf{S}_{i})=\int_{0}^{T} S_i^{2}(t) \mathrm{d}t=n_i\), where \(n_i\) is the number of spikes in \(\mathbf{S}_i\).

Define the distance \(D_{ij}\) between two spike trains as \[ D_{ij}=\left\| \mathbf{S}_i-\mathbf{S}_j \right\|_{} \tag{10.47} \] With \(K_{\Delta}(s,s')=\delta(s)\delta(s')\), \(D_{ij}^{2}\) is the total number of spikes in \(\mathbf{S}_i\) and \(\mathbf{S}_j\), reduced by 2 for each spike in \(\mathbf{S}_i\) that coincides with one in \(\mathbf{S}_j\). For the following, it is useful to think of a distance between spike trains as a number of non-coincident spikes.

Similarly, the angular separation between two spike trains is defined by \[ \cos \theta_{ij}=\frac{(\mathbf{S}_i,\mathbf{S}_j)}{\left\| \mathbf{S}_i \right\|_{}\left\| \mathbf{S}_j \right\|_{}}. \tag{10.48} \]
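A sketch of the inner product (10.46), the distance (10.47), and the angular separation (10.48), assuming spike trains are binned into 0/1 arrays and the kernel is a separable Gaussian \(K_{\Delta}(s,s')=k_g(s)k_g(s')\); for a separable kernel the double integral reduces to a dot product of the filtered spike trains (names hypothetical):

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

def inner_product(s_i, s_j, delta_bins):
    """(S_i, S_j) of (10.46) for a separable Gaussian kernel of width delta (in bins).

    The double integral reduces to a dot product of the Gaussian-filtered trains;
    overall dt scale factors are ignored here (they cancel in the angular separation).
    """
    f_i = gaussian_filter1d(np.asarray(s_i, dtype=float), delta_bins)
    f_j = gaussian_filter1d(np.asarray(s_j, dtype=float), delta_bins)
    return float(np.dot(f_i, f_j))

def distance(s_i, s_j, delta_bins):
    """D_ij = ||S_i - S_j|| (eq. 10.47)."""
    d = np.asarray(s_i, dtype=float) - np.asarray(s_j, dtype=float)
    return np.sqrt(inner_product(d, d, delta_bins))

def cos_angle(s_i, s_j, delta_bins):
    """Angular separation cos(theta_ij) (eq. 10.48)."""
    return inner_product(s_i, s_j, delta_bins) / np.sqrt(
        inner_product(s_i, s_i, delta_bins) * inner_product(s_j, s_j, delta_bins))
```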

Comparing Sets of Spike Trains

Let the two sets of spike trains be denoted by \(X\) and \(Y\), containing \(N_{X}\) and \(N_{Y}\) spike trains, respectively. Define \[ \hat{L}_{X}=\frac{1}{N_X}\sum_{i=1}^{N_{X}} \left\| \mathbf{S}_i^{(x)} \right\|_{}^{2}, \tag{10.49} \] where the hat denotes that the quantity is an experimental estimate. \(\hat{L}_{X}\) is related to the average spike count; it equals the average spike count exactly if the inner product satisfies (i) \(\int_{}^{} \int_{}^{} K_{\Delta}(s,s') \mathrm{d}s \mathrm{d}s'=1\) and (ii) \(K_{\Delta}(s,s')=0\) whenever \(\lvert s-s' \rvert\) is greater than the minimum interspike interval of any of the spike trains considered.

The vector of averaged spike trains \[ \hat{\nu}_{X}=\frac{1}{N_X}\sum_{i=1}^{N_{X}} \mathbf{S}_i^{(x)} \tag{10.50} \] is an estimate of the spike density (cf. the PSTH). Its expectation defines the instantaneous firing rate of the spiking process, \(\nu(t)=\langle \hat{\nu}\rangle\).

The variability is defined as the variance \[ \hat{V}_{X}=\frac{1}{N_{X}}\sum_{i=1}^{N_{X}} \left\| \mathbf{S}_i^{(x)}-\hat{\nu}_{X} \right\|_{}^{2}. \tag{10.51} \]

Variability relates to reliability. While the variability is a positive quantity that cannot exceed \(L_X\), the reliability is usually defined between zero and one, where one means perfectly reliable spike timing: \(\hat{R}_{X}=1-\hat{V}_{X}/\hat{L}_{X}\).

Finally, we come to a measure of the match between the sets of spike trains \(X\) and \(Y\). To assess whether the detailed time structure of the PSTH is reproduced, we define \[ \hat{M}=\frac{2(\hat{\nu}_{X},\hat{\nu}_{Y})}{\hat{R}_{X}\hat{L}_{X}+\hat{R}_{Y}\hat{L}_{Y}}. \tag{10.52} \]

We have \(M\) (for 'match') equal to one if \(X\) and \(Y\) have the same instantaneous firing rate; the smaller \(M\), the greater the mismatch between the two spiking processes. The quantity \(R_{X}L_{X}\) can be interpreted as a number of reliable spikes. Since \((\hat{\nu}_{X},\hat{\nu}_{Y})\) can be interpreted as a number of coincident spikes between \(X\) and \(Y\), we can regard \(M\) as a factor counting the fraction of coincident spikes.
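A sketch of (10.49)-(10.52), reusing the filtered-spike-train inner product of the previous sketch; X and Y are arrays of binned spike trains, one row per repetition, and all names are hypothetical:

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

def match_index(X, Y, delta_bins):
    """Match M (eq. 10.52) between two sets of binned spike trains of shape (N, T_bins)."""
    fX = np.array([gaussian_filter1d(np.asarray(s, dtype=float), delta_bins) for s in X])
    fY = np.array([gaussian_filter1d(np.asarray(s, dtype=float), delta_bins) for s in Y])

    L_X = np.mean([f @ f for f in fX])                       # eq. (10.49)
    L_Y = np.mean([f @ f for f in fY])
    nu_X, nu_Y = fX.mean(axis=0), fY.mean(axis=0)            # eq. (10.50), already filtered
    V_X = np.mean([(f - nu_X) @ (f - nu_X) for f in fX])     # eq. (10.51)
    V_Y = np.mean([(f - nu_Y) @ (f - nu_Y) for f in fY])
    R_X, R_Y = 1 - V_X / L_X, 1 - V_Y / L_Y                  # reliabilities

    return 2 * (nu_X @ nu_Y) / (R_X * L_X + R_Y * L_Y)       # eq. (10.52)
```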

If the kernel \(K_{\Delta}(s,s')\) is chosen to be \(k_g(s)k_g(s')\) and \(k_g\) is a Gaussian distribution of width \(\Delta\), then \(M\) relates to a mean square error between PSTHs that were filtered with \(k_g\). Therefore, the kernel used in the definition of the inner product (10.46) can be related to the smoothing filter of the PSTH.

Closed-loop stimulus design

Here we describe how to take advantage of the properties of the GLM to optimize our experiments: the objective is to select, in an online, closed-loop manner, the stimuli that will most efficiently characterize the neuron's response properties.

A key property of GLMs is that not all stimuli provide the same amount of information about the unknown coefficients \(\mathbf{k}\). We therefore need a well-defined objective function that ranks any given stimulus according to its potential informativeness.

Suppose the goal is to estimate the unknown parameters \(\theta\) of the model, given \(D=\{ \mathbf{x}(s),n_s\}_{s<t}\), the observed data up to the current trial. The posterior uncertainty in \(\theta\) can be quantified using the information-theoretic notion of 'entropy'; the next stimulus is then chosen so as to maximize the expected reduction of this uncertainty.

Summary

For a suitably chosen model class, the likelihood of the data being generated by the model is a concave function of the model parameters, i.e., there are no local maxima.

Once neuron models are phrased in the language of statistics, the problems of coding and stimulus design can be formulated in a single unified framework.

Comparing PSTHs and spike train similarity measures

Experimentally, the PSTH is constructed from a set of \(N_{rep}\) spike trains, \(S_i(t)\), measured from repeated presentations of the same stimulus. The ensemble average of the recorded spike trains, \[ \frac{1}{N_{rep}}\sum_{i=1}^{N_{rep}} S_i(t), \] is typically convolved with a Gaussian function \[ h_g(x)=(2\pi \sigma^{2})^{-1/2}\exp (-x^{2}/2\sigma^{2}) \] with \(\sigma\) around 5 ms, so that \(A_1(t)=(h_g * \frac{1}{N_{rep}}\sum_{}^{} S_i)(t)\) is a smoothed PSTH.

With the kernel \(K(t,t')=h_g(t)h_g(t')\), we have \[ \int_{0}^{T} (A_1(t)-A_2(t))^{2} \mathrm{d}t=\left\| \frac{1}{N_{rep}}\sum_{i=1}^{N_{rep}} S_i^{(1)}-\frac{1}{N_{rep}}\sum_{i=1}^{N_{rep}} S_i^{(2)} \right\|_{}^{2}, \] i.e., the squared distance between the two averaged sets of spike trains under this kernel.

The correlation coefficient between the two smoothed PSTHs can be written as an angular separation between the sets of spike trains with kernel \(K(t,t')=h_g(t)h_g(t')\).
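A sketch of the smoothed-PSTH construction, assuming binned 0/1 spike trains and a Gaussian smoothing width of about 5 ms; the correlation coefficient between two such PSTHs can then be computed with np.corrcoef (all names hypothetical):

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

def smoothed_psth(spike_trains, dt, sigma=5e-3):
    """Smoothed PSTH: trial-averaged binned spike trains convolved with a Gaussian.

    spike_trains : array (N_rep, T_bins) of 0/1 counts; sigma : smoothing width in seconds.
    """
    mean_train = np.asarray(spike_trains, dtype=float).mean(axis=0) / dt   # rate per bin
    return gaussian_filter1d(mean_train, sigma / dt)

# Correlation coefficient between two smoothed PSTHs A1 and A2 (hypothetical arrays):
# r = np.corrcoef(A1, A2)[0, 1]
```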

Victor and Purpura metric

Consider the minimum cost \(C\) required to transform a spike train \(S_i\) into another spike train \(S_j\) when the only available elementary transformations are:

- removing a spike, at a cost of one;
- adding a spike, at a cost of one;
- shifting a spike by a distance \(d\), at a cost \(qd\), where \(q\) is a parameter defining the temporal precision.

The cost \(C\) defines a metric that measures the dissimilarity between the spike trains \(S_i\) and \(S_j\).

For \(q=0\) units of cost per second, \(C\) becomes the difference in the number of spikes in \(S_i\) and \(S_j\).

For \(q>0\), \(C\) can be written as a squared distance \(D_{ij}^{2}\) with kernel \(K(t,t')=h_t(t)\delta(t')\) and the triangular function \[ h_t(t)=(1-\lvert t \rvert q/2)\Theta(1-\lvert t \rvert q/2) \] as follows: \[ C(S_i,S_j)=D_{ij}^{2}=\int_{0}^{T} \int_{-\frac{2}{q}}^{\frac{2}{q}} (1-\lvert s \rvert \frac{q}{2})[S_i(t-s)-S_j(t-s)][S_i(t)-S_j(t)] \mathrm{d}s \mathrm{d}t \]
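In practice the minimum transformation cost can be computed with the standard dynamic-programming recursion for the Victor-Purpura metric; the sketch below assumes sorted spike-time arrays and is not taken from the text:

```python
import numpy as np

def victor_purpura(s_i, s_j, q):
    """Victor-Purpura cost C(S_i, S_j) via the standard dynamic-programming recursion.

    s_i, s_j : sorted spike times (in seconds); q : cost per second of shifting a spike.
    """
    n_i, n_j = len(s_i), len(s_j)
    G = np.zeros((n_i + 1, n_j + 1))
    G[:, 0] = np.arange(n_i + 1)            # delete all spikes of S_i
    G[0, :] = np.arange(n_j + 1)            # insert all spikes of S_j
    for a in range(1, n_i + 1):
        for b in range(1, n_j + 1):
            G[a, b] = min(G[a - 1, b] + 1,                                     # delete
                          G[a, b - 1] + 1,                                     # insert
                          G[a - 1, b - 1] + q * abs(s_i[a - 1] - s_j[b - 1]))  # shift
    return G[n_i, n_j]

# e.g. victor_purpura([0.010, 0.050], [0.012, 0.100], q=20.0)
```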

References

  1. N. G. van Kampen (2007). Stochastic Processes in Physics and Chemistry. North-Holland/Elsevier.
