
In this TCIU section, we discuss the duality of functions and functionals, i.e., generalized functions or distributions. Specifically, we will define distributions as function-like objects that act as linear functionals on smooth test functions. Distributions are necessary because ordinary functions are insufficient to describe many physical phenomena and do not accommodate the non-smooth solutions of some differential equation models.

1 Overview

Generalized functions (distributions) play an important role in solving differential equations, in the harmonic-analysis representation of functions or signals as superpositions of basic waves, and in defining complex operations on difficult functions, e.g., defining derivatives of functions that are not classically differentiable. Distributions are related to, but more general than, the more commonly known probability density functions (PDFs) and their counterparts, cumulative distribution functions (CDFs).

We will consider the Dirac delta generalized function, \(\delta(x)\), also known as the point mass function. It is not well-defined as a function, since it is infinite at the origin and zero everywhere else, yet its integral is \(\int_{\mathbb{R}}{\delta(x)\ dx}=1\). The Dirac delta is not a real function, but a generalized function, or functional, acting as a linear operator on a special class of functions called test functions. In a measure-theoretic sense, functions can take on infinite values at single singularity points (of trivial Lebesgue or Borel measure) without impacting the integral of the function. Hence, the Dirac delta generalized function cannot be defined as a function that is infinite at \(0\) and integrates to \(1\), since this leads to contradictory interpretations of (1) linear transformations of \(\delta(\cdot)\), e.g., \(5\times \delta=5\times 0=0\), (2) its (function) derivatives, and (3) its Fourier transform. We will see that defining the Dirac delta as a generalized function (distribution) permits consistent interpretations of its linear superpositions, its generalized functional derivatives (distributional derivatives), and its Fourier transform.

The power of generalized functions comes from the fact that distributions, as functionals, enable integration, differentiation, and Fourier transformation of a larger class of functions and avoid conflicts like \(5\delta=\delta\). More specifically, generalized functions are not functions of real numbers in the conventional sense; rather, they are linear functionals that transform other functions, called test functions, into real numbers. Hence, a distribution \(f\) acts on a test function \(\phi\) as a linear functional by mapping the test function to a real number:

\[\underbrace{f}_{distribution}: \underbrace{\phi}_{test\ function} \longrightarrow \underbrace{\int_{\mathbb{R}} {f(x)\ \phi (x)\ dx}}_{\in\ \mathbb{R}}\ .\]

So, loosely speaking, the distribution \(f\) is a linear function; more accurately, it is a linear functional, a member of the dual space of the test functions:

\[\underbrace{f}_{distribution}: \underbrace{\alpha\phi+ \beta\psi}_{superposition\ of\\ test\ functions} \longrightarrow \underbrace{\alpha \int_{\mathbb{R}} {f(x)\ \phi (x)\ dx}+ \beta \int_{\mathbb{R}} {f(x)\ \psi (x)\ dx}}_{\in\ \mathbb{R}}\ , \ \forall\ \alpha,\beta\in \mathbb{R}\ .\]

We will illustrate that the delta distribution \(\delta\) is concentrated at zero and operates on a test function \(\phi\) by returning the value of the test function at the origin

\[\delta: \phi \longrightarrow \int_{\mathbb{R}} {\delta(x)\ \phi (x)\ dx}=\phi(0)\ ,\]

since \(\int_{\mathbb{R}}{\delta(x)\ dx}=1\) and \(\delta(x)\equiv 0,\ \forall\ x\in \mathbb{R}\setminus\{0\}\).

In general, generalized functions can be represented as limits of traditional functions.

Generalized derivatives (distributional derivatives) are defined to agree with derivatives of conventional functions. Suppose \(f(x)\) is a differentiable univariate function and \(\phi(x)\in \mathcal{C}_c^{\infty}(\mathbb{R})\) is an infinitely differentiable test function that is zero outside of some finite interval. Classical integration by parts yields

\[\underbrace{f'}_{distributional\\ derivative\ is\\ a\ distribution} : \phi \longrightarrow \int_{-\infty}^\infty f'(x)\, \phi(x) \, dx = \\ \underbrace{f(x)\, \phi(x) \bigg|_{-\infty}^{\infty}}_{0,\ \phi\in \mathcal{C}_c^{\infty}(\mathbb{R})} -\int_{-\infty}^\infty f(x)\, \phi'(x) \ dx= -\int_{-\infty}^\infty f(x)\, \phi'(x) \ dx\ .\]

Even when the generalized function \(f(x)\) is not differentiable in a classical sense, the distributional derivative definition above is well-defined in terms of the derivative of the test function \(\phi\), which the distribution \(f\) acts on.
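As a quick numerical sanity check (a sketch with an assumed test function \(\phi(x)=e^{-(x-1)^2}\)), we can approximate the pairing \(-\int f(x)\phi'(x)\,dx\) for the non-differentiable function \(f(x)=|x|\); the result should match \(\int \text{sign}(x)\phi(x)\,dx\), since \(\text{sign}(x)\) is the weak (distributional) derivative of \(|x|\).

# Numerical sketch: distributional derivative of f(x) = |x| acting on an assumed test function
phi       <- function(x) exp(-(x - 1)^2)            # smooth, rapidly decaying test function
phi_prime <- function(x) -2*(x - 1)*exp(-(x - 1)^2) # its classical derivative
f <- function(x) abs(x)                             # not differentiable at the origin

# f'(phi) = -integral of f(x)*phi'(x) dx
lhs <- -integrate(function(x) f(x)*phi_prime(x), -Inf, Inf)$value
# the weak derivative of |x| is sign(x)
rhs <- integrate(function(x) sign(x)*phi(x), -Inf, Inf)$value
c(distributional_derivative = lhs, weak_derivative_pairing = rhs)   # the two values agree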

Technically, the functional \(f\) is not a function of real numbers, but a distribution that operates on test functions, producing real numbers. In other words, \(f\) is a linear functional on the space of test functions, mapping \(\phi\) to \(\int {f(x) \phi(x)\ dx}\in \mathbb{R}\).

Similarly, the higher-order distributional derivatives \(f^{(n)}\) of an integrable generalized function \(f\) are defined by transferring the differentiation to the corresponding higher-order derivatives of the infinitely differentiable test functions \(\phi\), i.e., \(\phi^{(n)}\)

\[f^{(n)}: \phi \longrightarrow (-1)^n\int_{-\infty}^\infty f(x) \, \phi^{(n)}(x)\, dx\ .\]

Even if \(f\) is not smooth or differentiable, it suffices that \(f\) is (locally) integrable to define its distributional (generalized, or weak) derivatives of any order.

For instance, the delta functional is not even a function; nevertheless, its distributional derivative is well-defined.

This property provides a strong motivation for introducing generalized functions, as they are always differentiable and expand the space of candidate solutions to difficult differential equations, which may not admit classical function solutions. Distributions may also aid the search for classical solutions, just as lifting the dimensionality in many data science problems provides meaningful ways to solve difficult optimization problems and fit AI models, see TCIU Chapter 3 Appendix on RKHS. Proving the existence of a generalized (distributional) solution of an equation often provides clues for deriving a classical function solution from the distributional solution.

Let’s consider a heuristic example of generalized solutions yielding classical solutions. Consider an integer programming problem, i.e., minimization of an objective (cost) function over integer arguments, see DSPA Chapter 13 (Optimization). The more general optimization of the same objective function over the reals may be easier to solve first, e.g., using classical gradient descent minimization. Having the real-valued minimizer, we can check the nearby integers to identify the desired integer solutions of the original integer programming problem. Generalized (distributional) derivatives can be used analogously to approximate the derivatives of difficult (non-smooth) functions. Searching for solutions over a larger state space is a common mathematical trick for estimating approximate solutions over the original (constrained) state space.
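A toy sketch of this relax-then-round heuristic (the cost function below is hypothetical, chosen only for illustration): minimize the cost over the reals with optimize(), then inspect the neighboring integers.

# Hypothetical smooth, convex cost function whose argument is nominally integer-constrained
cost <- function(x) (x - 2.7)^2 + exp(-x)

# Step 1: relax the integer constraint and minimize over the reals
x_real <- optimize(cost, interval = c(-10, 10))$minimum

# Step 2: evaluate the cost at the integers adjacent to the real-valued minimizer
candidates <- c(floor(x_real), ceiling(x_real))
x_int <- candidates[which.min(cost(candidates))]
c(real_minimizer = x_real, integer_minimizer = x_int)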

Generalized functions are related to probability distributions. Probability density functions are non-negative real-valued functions, \(f:\mathbb{R}\to\mathbb{R}^+\) that integrate to unity, \(\int f(x)dx=1\). All observed phenomena and events (sets of observable outcomes) can be represented by some (marginal, joint, or conditional) probability distributions by integrating the probability density function over the event/set.

The probability density also facilitates the calculation of the expected value of a test function, by integrating the function against the probability density. So, distributions are generalized functions that can be integrated or differentiated analogously, through the test functions they act upon.

Of course, the values of probability density functions are just real numbers, not probabilities, but the densities are integrands that yield the desired probabilities. Similarly, by themselves, generalized functions are not functions but linear operators that act on special kinds of (test) functions to yield real numbers.

The class of test functions operated on by distributions can vary. The most useful class of test functions is the space of infinitely differentiable functions with compact support, \(\mathcal{C}_c^{\infty}(\mathbb{R})\). In Fourier analysis, the set of Schwartz functions

\[\mathcal{S}(\mathbb{R^n})=\{f\in C^{\infty}(\mathbb{R^n}): ||f||_{\alpha,\beta}<\infty \,, \forall \alpha, \beta\} \tag{1}\]

represents another useful class of natural test functions, consisting of all infinitely differentiable functions of rapid decay, i.e., functions \(\phi\) such that \(x^n \phi(x)\underset{x\to\pm\infty}{\longrightarrow} 0,\ \forall\ n\in\mathbb{N}\). This rapid decay condition guarantees that the Fourier transforms of Schwartz functions are also Schwartz functions.
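As a small numerical illustration (a sketch with two assumed example functions), the Gaussian \(e^{-x^2}\) satisfies the rapid-decay condition \(x^n \phi(x)\to 0\) for every \(n\), whereas the slowly decaying \(1/(1+x^2)\) already fails it for \(n=2\).

x <- 10^(1:4)                       # increasingly large arguments
gauss  <- function(x) exp(-x^2)     # Schwartz-class example
cauchy <- function(x) 1/(1 + x^2)   # decays too slowly to be a Schwartz function

# x^n * phi(x) vanishes for every n for the Gaussian ...
sapply(1:3, function(n) x^n * gauss(x))    # all entries are essentially 0
# ... but not for the slowly decaying function (n = 2 tends to 1, n = 3 diverges)
sapply(1:3, function(n) x^n * cauchy(x))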

Note that compactly supported (time-limited) functions are not well-suited for Fourier analysis since their compact support guarantees that the support of their Fourier transforms will be infinite! This is due to the Heisenberg uncertainty principle indicating that the more concentrated a function is in spacetime, the less concentrated the support of its Fourier transform is in the (frequency domain) Fourier space. No function can be both time-limited and band-limited.

To draw a direct connection to spacekime analytics, and specifically to kime-representations of repeated-measurement longitudinal data, we will first review the fundamental properties of generalized functions.

2 The Dirac Delta Distribution

The Dirac delta distribution \(\delta(x)\) represents the most basic generalized function (distribution). It can be defined as a limit of smooth Gaussian functions \[\delta_n(x) = \frac{n}{\sqrt{2\pi}}\ e^{-\frac{n^2 x^2}{2}} \underset{n\to\infty}{\longrightarrow} \delta(x)\ .\]

Clearly, these are classical Gaussian (normal) densities \(N\left(0,\ \sigma_n^2=\frac{1}{n^2}\right)\) with progressively decaying variance \(\sigma_n^2=\frac{1}{n^2}\).

library(plotly)

# Grid of x values and a range of n indices for the Gaussian approximations delta_n ~ N(0, sd = 1/n)
x <- seq(from=-4, to=4, length.out=2000)
indexN <- 1:10
y <- list()

p <- plot_ly(type = "scatter", mode="lines")
for (i in indexN) {
  # density of N(mean=0, sd=1/n): the n-th approximation of the Dirac delta
  y[[i]] <- dnorm(x, mean=0, sd=1/indexN[i])
  p <- p %>% add_trace(x=x, y=y[[i]], 
                       hovertemplate = paste0("Distribution=N(mu=0,sd=1/", indexN[i], ")\n", "(%{x}, %{y})"),
                       name=paste0("N(mu=0,sd=1/", indexN[i], ")"))
}
p <- p %>% 
  layout(title="Gaussian Test Functions Approximating Dirac Delta",
     xaxis=list(title="Critical Values"), yaxis=list(title="Densities"))
p

Using the Taylor series expansion of \(f\), the convolution of \(\delta_n(x)\) with any bounded smooth function \(f(x)\in C^{\infty}(\Omega)\), \(\Omega\subseteq \mathbb{R}^d\), recovers the value of \(f\) in the limit

\[\underbrace{(f*\delta)}_{convolution}(x_o) \overset{n\to\infty}{\longleftarrow} (f*\delta_n)(x_o)=\int_{-\infty}^{\infty} {f(x)\delta_n(x-x_o)\ dx}\ \overset{n\to\infty}{\approx} f(x_o)\underbrace{\int_{-\infty}^{\infty} {\delta_n(x-x_o)\ dx}}_ {1}=f(x_o)\ .\]

Hence, we can define the generalized function (distribution) \(\delta\) as a linear functional over \(C^{\infty}(\Omega)\), i.e., \(\delta\) is an element of the dual space of the bounded smooth functions, \(\delta:C^{\infty}(\Omega)\longrightarrow \mathbb{R}\), acting by

\[\delta(f)(x_o)\equiv (f*\delta)(x_o)\equiv f(x_o),\ \forall f\in C^{\infty}(\Omega),\ \forall x_o\in\Omega \ .\]

The generalized Dirac delta function picks out the value of the function \(f\) it acts on at the specified point \(x_o\) in the domain.
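The following numerical sketch (with an assumed bounded smooth function \(f(x)=\sin(x)+2\) and evaluation point \(x_o=0.7\)) illustrates how the convolutions \((f*\delta_n)(x_o)\) approach \(f(x_o)\) as \(n\) grows.

# Gaussian approximations delta_n ~ N(0, sd = 1/n), as plotted above
delta_n <- function(x, n) dnorm(x, mean = 0, sd = 1/n)
f   <- function(x) sin(x) + 2       # an assumed bounded smooth function
x_o <- 0.7

# (f * delta_n)(x_o) = integral of f(x)*delta_n(x - x_o) dx, computed over x_o +/- 8 standard
# deviations, which captures essentially all of delta_n's mass
conv_at_xo <- sapply(c(1, 2, 5, 10, 50), function(n)
  integrate(function(x) f(x)*delta_n(x - x_o, n), x_o - 8/n, x_o + 8/n)$value)
rbind(n = c(1, 2, 5, 10, 50), convolution = round(conv_at_xo, 4), f_at_xo = round(f(x_o), 4))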

A linear functional \(\mathcal{L}\) is a linear mapping from a vector space to a field, e.g., \(\mathcal{L}:V\longrightarrow\mathbb{C}\) where \[\mathcal{L}(\alpha_1 v_1 + \alpha_2 v_2)= \alpha_1 \mathcal{L}(v_1) + \alpha_2 \mathcal{L}(v_2),\ \forall v_1,v_2\in V,\ \forall \alpha_1,\alpha_2\in\mathbb{C}.\]

Recall that evaluation of the derivative at a point is a linear functional from the vector space of differentiable functions to the reals (or the complex plane), e.g., \(\forall d\in C^1(\mathbb{R}),\ d(x)\longrightarrow d'(0)\in\mathbb{R}\).

Similarly, the Dirac delta linear functional \(\delta:C^{\infty}(\Omega)\longrightarrow \mathbb{C}\) acts by \(f(x) \longrightarrow (f*\delta)(x_o)=f(x_o)\). More generally, all bounded and integrable functions \(p(x)\) are also linear functionals, \(p:C^{\infty}(\Omega)\longrightarrow \mathbb{C}\), acting by mapping \(C^{\infty}(\Omega)\ni f(x) \longrightarrow \int_{-\infty}^{\infty} {f(x)p(x)\ dx}\in\mathbb{R}\). Note that most of the time the reals represent the base field, \(\mathbb{R}\subseteq \mathbb{C}\).

We should recognize the duality of \(p(x)\) as being a function in \(C^{\infty}(\Omega)\) and a functional transforming functions to scalars in the base field.

Let’s define the set of test functions as the set of infinitely differentiable functions with bounded support, \(\mathcal{D}=C_0^{\infty}(\Omega)\). Then, a distribution \(d\) is a continuous linear functional \(d(\phi): \mathcal{D}\longrightarrow \mathbb{R}\), such that

\[d(\alpha_1 \phi_1 + \alpha_2 \phi_2)= \alpha_1 d(\phi_1) + \alpha_2 d(\phi_2),\ \forall \phi_1,\phi_2\in \mathcal{D},\ \forall \alpha_1,\alpha_2\in\mathbb{C}\ .\]

Relations between distributions and functions:

  • All continuous functions \(d(x)\) are also distributions, acting by \(\mathcal{D}\ni \phi(x) \overset{d}{\longrightarrow} \int_{-\infty}^{\infty} {d(x)\phi(x)\ dx}\in\mathbb{R}\).
  • All distributions can be approximated by functions: \(\exists\) a sequence \(\{d_n(x)\}_n\subset \mathcal{D}\) such that \(\forall \phi\in\mathcal{D}\), \(d(\phi) = \underset{n\to\infty}{\lim} {\int_{\mathbb{R}}d_n(x)\phi(x)\ dx}\).

2.1 First-principles examples of distributions

  • Linear mixtures of delta functionals are distributions. For instance, \(d(x)=-2\delta(x+3) + \delta(x)\), \(\forall \phi\in\mathcal{D}\), \(d(\phi)=\int_{-\infty}^{\infty} {d(x)\phi(x)\ dx}=-2\phi(-3)+\phi(0)\in\mathbb{R}\).

  • Heaviside (step function) \(H(x)=\begin{cases} 0 & x\leq 0 \\ 1 & x\gt 0 \end{cases}\) is a generalized function because \(\forall \phi\in\mathcal{D}\),

\[d(\phi)\equiv H(\phi)\equiv \int_{-\infty}^{\infty} {H(x)\phi(x)\ dx} = \int_{0}^{\infty} {\phi(x)\ dx} \underset{by\ parts}{=} \underbrace{x\phi(x)\Biggr|_{0}^{\infty}}_{0} - \int_{0}^{\infty} {x\phi'(x)\ dx} = \int_{0}^{\infty} {(-x)\phi'(x)\ dx}\ \in\ \mathbb{R}\ .\]

2.2 Distributional derivatives

For more details, see the TCIU section on Radon-Nikodym derivatives.

Let’s demonstrate a couple of examples of differentiating non-differentiable functions interpreted as distributions. The distributional derivative of \(d(\phi)\) is defined to be the (derivative) distribution \(d'(\phi)\equiv-d(\phi')\). Assume there is a smooth function \(d(x)\) such that \(d(\phi)=\underset{\mathbb{R}}{\int}{d(x)\phi(x)\ dx}\). Then, integration by parts yields

\[d'(\phi)\underset{def}{=} \underset{\mathbb{R}}{\int}{d'(x)\phi(x)\ dx} \underset{d'(\phi)\equiv-d(\phi')}{=} -\underset{\mathbb{R}}{\int}{d(x)\phi'(x)\ dx}= -d(\phi'),\ \forall \phi\in\mathcal{D}\ .\]

  • Distributional derivative of the Heaviside (step) function, \(H(x)\): \(H'(\phi)\underset{def}{=} -\underset{\mathbb{R}}{\int}{H(x)\phi'(x)\ dx}= -\int_{0}^{\infty}{\phi'(x)\ dx} =\phi(0)=\int_{-\infty}^{\infty}{\delta(x)\phi(x)\ dx}=\int_{-\infty}^{\infty}{H'(x)\phi(x)\ dx},\ \forall \phi\in\mathcal{D}\ .\) Therefore, the distributional derivative of the Heaviside function is the Dirac generalized function, \(H'(x)=\delta(x)\); see also the numerical check after this list.

  • Distributional derivative of the Dirac delta functional, \(\delta'(\phi)=-\delta(\phi')=-\phi'(0)\).
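Below is a brief numerical check of the first bullet above (a sketch with an assumed test function \(\phi(x)=e^{-(x-1)^2}\)): the pairing \(-\int H(x)\phi'(x)\,dx\) reproduces \(\phi(0)\).

phi       <- function(x) exp(-(x - 1)^2)              # assumed smooth, rapidly decaying test function
phi_prime <- function(x) -2*(x - 1)*exp(-(x - 1)^2)

# H'(phi) = -integral of phi'(x) dx over [0, Inf), which should equal phi(0)
H_prime_phi <- -integrate(phi_prime, lower = 0, upper = Inf)$value
c(H_prime_of_phi = H_prime_phi, phi_at_0 = phi(0))    # both approximately exp(-1) = 0.3679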

2.2.1 Higher-order distributional derivatives

The higher-order distributional derivatives are defined inductively using repeated integration by parts.

\[d^{(n)}(\phi)\underset{def}{=} \underset{\mathbb{R}}{\int}{d^{(n)}(x)\phi(x)\ dx} = (-1)^n d(\phi^{(n)}),\ \forall \phi\in\mathcal{D}\ .\]

Example 1: Second distributional derivative of the Heaviside functional:

\[H''(\phi) = \left (H'(\phi)\right )' = \left (\delta(\phi)\right )'= \delta'(\phi)=-\phi'(0),\ \forall \phi\in\mathcal{D}\ .\]

Example 2: Next, let’s look at the second-order derivative of \(f(x) = |x|\), which is clearly not differentiable as a function at the origin. Observe that \(\forall \phi\in\mathcal{D}\), splitting the domain into two regions and integrating by parts gives

\[f''(\phi) = (-1)^2 f(\phi'')= \int_{\mathbb{R}}{f(x)\phi''(x)\ dx}= \int_{-\infty}^0{(-x)\phi''(x)\ dx}+\int_0^{\infty}{x\phi''(x)\ dx}=\\ \underbrace{-x\phi'(x)\Biggr|_{-\infty}^0}_{0} + \underbrace{x\phi'(x)\Biggr|_{0}^{\infty}}_{0}+ \underbrace{\phi(x)\Biggr|_{-\infty}^0}_{\phi(0)} - \underbrace{\phi(x)\Biggr|_{0}^{\infty}}_{-\phi(0)}=2\phi(0)\ .\]

Therefore, the second order distributional derivative \(|x|''=2\delta(x)\).
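A matching numerical check of Example 2 (a sketch with an assumed test function \(\phi(x)=e^{-(x-0.5)^2}\)): the pairing \(\int |x|\,\phi''(x)\,dx\) indeed returns \(2\phi(0)\).

phi        <- function(x) exp(-(x - 0.5)^2)                       # assumed test function
phi_second <- function(x) (4*(x - 0.5)^2 - 2)*exp(-(x - 0.5)^2)   # its second derivative

# |x|''(phi) = integral of |x| * phi''(x) dx, which should equal 2*phi(0)
abs_second_phi <- integrate(function(x) abs(x)*phi_second(x), -Inf, Inf)$value
c(abs_second_of_phi = abs_second_phi, two_phi_at_0 = 2*phi(0))    # both approximately 2*exp(-0.25)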

3 Formal Definitions of Distributions

Often in practice it is more meaningful to discuss physical quantities in terms of generalized functions using \(\int f(x)\varphi(x)dx\) rather than using specific functional values, \(f(x)\).

The space of test functions is denoted by \(D(\Omega)\ni \psi\) and represents the space of smooth functions with compact support on a domain \(\Omega\). A test function \(\psi: \Omega \to \mathbb{R}\) has the following properties:

  • It vanishes outside a bounded subset of \(\Omega\) that stays away from the boundary of \(\Omega\), \(\partial \Omega\).
  • It has continuous derivatives of all orders.

Then, the definition of a generalized function, i.e., a distribution, is based on the space of test functions. There are three equivalent definitions of distributions.

3.1 Distributions as continuous linear functionals

The class of distributions on \(\Omega\) consists of all continuous linear functionals mapping the space of test functions \(D(\Omega)\) to the real or complex numbers.

Notes:

  1. The mapping \(T: D(\Omega)\to \mathbb{C}\) should be linear, i.e., \[T_f(a\varphi_1 + b\varphi_2 )\equiv \langle f, a\varphi_1 + b\varphi_2 \rangle = a\langle f, \varphi_1\rangle + b\langle f,\varphi_2\rangle\ ,\] for any test functions \(\varphi_1, \varphi_2 \in D(\Omega)\).

  2. The mapping \(T\) (note that we can suppress the continuous function \(f\) in the subindex of \(T_f\)) should also be continuous:

For any test function \(\varphi \in D(\Omega)\) and any sequence of test functions \(\{ \varphi_{n} \} \subset D(\Omega)\) with \(\varphi_n\stackrel{D}\longrightarrow\varphi\), it holds that \(T(\varphi_n)\longrightarrow T(\varphi)\).

  3. All continuous functions \(f\) are distributions, but not vice-versa.

Denote \(T_f = \langle f, \varphi \rangle\), which is a linear and continuous mapping from \(D\) to the real or complex numbers. Thus, each continuous function \(f\) represents a distribution, whereas the Dirac \(\delta\) distribution is not given by a continuous function.

3.2 Distributions as a limit of ordinary functions

Since an arbitrary distribution can be approximated by a sequence of ordinary functions, this suggests an alternative way to define distributions.

Consider a sequence of functions \(\{f_n\}\) approximating a function \(f\). Define the corresponding sequence of distributions \(T_n\) by \(T_n (\varphi)= \langle f_n,\varphi\rangle\). Then \(T_f(\varphi) = \langle f, \varphi\rangle = \lim_{n \to \infty}\langle f_n,\varphi \rangle\) defines a new distribution.

Similarly, if \(\{f_n\}\) is a sequence of distributions for which the limit \(\lim_{n\to\infty}\langle f_n, \varphi \rangle\) exists for all test functions \(\varphi\), then \(\langle f, \varphi\rangle = \lim_{n \to \infty} \langle f_n, \varphi \rangle\) defines a limiting (fixed-point) distribution, \(f_n \longrightarrow f\).

This is supported by the theorem that for any distribution \(f\), there always exists a sequence of test functions \(\{\varphi_n\}\) such that \(\varphi_n \longrightarrow f\) in the distributional sense.
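For instance (an illustrative sketch with assumed functions), the smooth sigmoids \(f_n(x)=1/(1+e^{-nx})\) converge to the Heaviside step, and their pairings with a test function \(\varphi(x)=e^{-(x-0.3)^2}\) converge to \(\langle H, \varphi\rangle=\int_0^{\infty}\varphi(x)\,dx\).

phi <- function(x) exp(-(x - 0.3)^2)             # assumed test function
f_n <- function(x, n) 1/(1 + exp(-n*x))          # smooth approximations of the Heaviside step

pairings <- sapply(c(1, 5, 20, 100), function(n)
  integrate(function(x) f_n(x, n)*phi(x), -Inf, Inf)$value)
target <- integrate(phi, lower = 0, upper = Inf)$value   # <H, phi>
rbind(n = c(1, 5, 20, 100), pairing = round(pairings, 4), limit = round(target, 4))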

Note: There might exist an analogous relationship between the convergence of test functions to a distribution and the convergence of finite samples to a population; this analogy deserves further exploration.

3.3 Convolution quotients

(This section still needs to be refined!)

Definition of the convolution quotient: \[(f * d)(x) = \int f(x-y)d(y)dy\]

Here we first introduce the concept of the Schwartz class \(S\), which refers to the set of functions all of whose partial derivatives are rapidly decreasing. There are two equivalent ways to define the rapidly decreasing functions:

  1. There exist constants \(M_N\) such that \[|f(x)| \leq M_N|x|^{-N}\ \text{ as } x \longrightarrow \infty \] for \(N = 1, 2, 3,\ldots\).
  2. For any polynomial \(p(x)\), \(p(x)f(x)\) always goes to zero as \(x \longrightarrow \infty\).

\(D(\mathbb{R}^n)\) is a subset of \(S(\mathbb{R}^n)\), and \(e^{-|x|^2}\) is an example that belongs to \(S(\mathbb{R}^n)\) but does not belong to \(D(\mathbb{R}^n)\), since it does not have bounded support.

The concept of tempered distributions refers to the continuous linear functionals defined on \(S\).

Here we can define an arbitrary tempered distribution \(g(x)\): \[g(x) = \psi(x) * f(x) = \int \psi(x-y)f(y)dy = \langle f, \tau_{-x}\tilde{\psi} \rangle ,\] where \(\tau_{-x}\tilde{\psi} = \psi(x-y)\), \(f,\psi \in S\).

This defines the tempered distributions by convolution quotient \(\psi(x) * f(x)\).

Example: \(S\) is a subset of the set of tempered distributions \(S'\). Take \(f = \delta\) and an arbitrary \(\psi \in S\) to get

\[\psi = \langle \delta, \tau_{-x}\tilde{\psi} \rangle = \psi * \delta(x) \in S'\ .\] Consider two test functions \(\phi\) and \(\psi\), where \(\phi\) is a convolution kernel, and \(\psi\) is a convolution operator. Here we define a new distribution f:

\[\langle f , \phi * \psi \rangle = \lim_{n \to \infty} \langle f_n, \phi * \psi \rangle .\]

Note:

The above three definitions of distributions, as generalized functions, are equivalent.

Consider an arbitrary function \(f\) from the second definition: since the sequence of continuous functions \(\varphi_n\) converges uniformly to \(f\), \(f\) is also a continuous function and thus satisfies the first definition of a distribution. Conversely, according to the theorem above, for any distribution given by the first definition we can always find a test function sequence \(\{\varphi_n\}\) that converges uniformly to \(f\), so \(f\) is also a distribution in the sense of the second definition.

The third definition only covers tempered distributions, but it agrees with the first definition; that is, any tempered distribution defined via the convolution quotient can also be written in the form of the first definition. Take an arbitrary \(g(x) = (\psi * f)(x)\), for example,

\[ \langle g, \varphi \rangle = \langle \psi * f, \varphi \rangle = \langle f, \tilde{\psi} * \varphi \rangle \in \mathbb{R} \ \text{ or } \ \mathbb{C}, \]

where \(f\in S'\) and \(\tilde{\psi} * \varphi \in S\). This means that \(g\) represents a linear functional mapping the Schwartz class to the real or complex numbers.

Question: In the third definition, we considered a larger set of input functions, i.e. \(S\) instead of the set of test functions \(D\). Can we say these three definitions are equivalent?

4 Distribution Properties

4.1 Synergies between distributions and real numbers

The relationship between distributions and functions is analogous to the relation between real numbers and rational numbers.

Recall the Euler natural number approximation

\[\lim_{n \to \infty}\left (1 + \frac{1}{n}\right )^n = e,\] i.e., a sequence of rational numbers \(\left (1 + \frac{1}{n}\right)^n\) approximates the real (transcendental) Euler number \(e\approx 2.718282\cdots\).
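For example, evaluating a few terms of this rational sequence in R shows the convergence:

n <- 10^(0:6)
cbind(n = n, rational_term = (1 + 1/n)^n, error = abs((1 + 1/n)^n - exp(1)))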

Similarly, infinite sequences of functions converge to distributions in the same sense. Take the Dirac delta function \(\delta\) as an example.

Consider a sequence of test functions \({f_k(x)}\) that satisfies the following three conditions:

  1. \(f_k(x) = 0\) unless \(|x| \leq \frac{1}{k}\).
  2. \(\int_{-\frac{1}{k}}^{\frac{1}{k}}f_k(x)dx = 1\).
  3. \(f_k\) is smooth.

Here we get that \(\langle f_k, \varphi \rangle\) converges to \(\langle \delta ,\varphi \rangle\), so \[\lim_{k \to \infty}f_k = \delta\ .\]
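One standard construction of such a sequence (a sketch using the classical smooth “bump” mollifier, with an assumed test function \(\varphi(x)=\cos(x)+x^2\)) is shown below; the pairings \(\langle f_k, \varphi\rangle\) converge to \(\varphi(0)=1\).

# Smooth bump supported on [-1/k, 1/k], normalized so that it integrates to 1
bump <- function(x, k) ifelse(abs(k*x) < 1, exp(-1/(1 - (k*x)^2)), 0)
f_k  <- function(x, k) {
  c_k <- 1/integrate(function(x) bump(x, k), lower = -1/k, upper = 1/k)$value   # normalizing constant
  c_k * bump(x, k)
}

phi <- function(x) cos(x) + x^2                   # assumed smooth test function
pairings <- sapply(c(1, 2, 5, 10, 50), function(k)
  integrate(function(x) f_k(x, k)*phi(x), lower = -1/k, upper = 1/k)$value)
rbind(k = c(1, 2, 5, 10, 50), pairing = round(pairings, 4), phi_at_0 = phi(0))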

Summarizing, both distributions and real numbers are extensions of (locally integrable) functions and rational numbers, respectively. That’s why the distributions are also called generalized functions.

For any locally integrable function \(f\), we associate a distribution, also denoted \(f\), given by \(T_f = \langle f,\varphi \rangle\). Other distributions, which do not correspond to locally integrable functions, can be represented as limits of sequences of locally integrable functions, like \(\delta\).

4.2 Fourier transforms and Convolutions of tempered distributions

Recall the Schwartz class \(S\) and tempered distributions we mentioned before.

Since we need to consider Fourier transforms of the distributions and convolutions of distributions, we just consider tempered distributions here.

  1. \(\int \hat{f}(x)\varphi(x)dx = \int f(x)\hat{\varphi}(x)dx\), where \(f \in S'\), \(\varphi \in S\), and \(\hat{f}(\xi) = \int f(x)e^{ix \xi}dx\). Here, \(\hat{f}\) is the distributional Fourier transform of \(f\); see the numerical check after this list.

  2. \(\langle \psi * f, \varphi \rangle = \langle f, \tilde{\psi} * \varphi \rangle\), where \(f \in S^{'}\), \(\varphi,\psi \in S\), and \(\tilde{\psi}(x) = \psi(-x)\).
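A numerical check of property 1 (a sketch using the same sign convention \(\hat{f}(\xi)=\int f(x)e^{ix\xi}dx\) and two assumed Gaussian examples, \(f(x)=e^{-x^2/2}\) and \(\varphi(x)=e^{-x^2}\); for real, even functions the transform reduces to a cosine integral):

# Fourier transform under the convention above; for even real functions only the cosine part survives
ft <- function(f, xi) sapply(xi, function(s)
  integrate(function(x) f(x)*cos(x*s), -10, 10)$value)

f   <- function(x) exp(-x^2/2)     # assumed Schwartz-class examples
phi <- function(x) exp(-x^2)

# Property 1: integral of hat_f(x)*phi(x) dx equals integral of f(x)*hat_phi(x) dx
lhs <- integrate(function(x) ft(f, x)*phi(x), -8, 8)$value
rhs <- integrate(function(x) f(x)*ft(phi, x), -8, 8)$value
c(lhs = lhs, rhs = rhs)            # both pairings equal 2*pi/sqrt(3) ~ 3.6276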

4.3 The Structure of Distributions

Generally speaking, if \(T\) is a distribution, then \(T\) can be written as an infinite sum of derivatives of functions:

\[T = \sum_{n=1}^{\infty}f_n^{(n)}(x)\ .\] For example, \[T = \sum_{n=1}^{\infty}\delta^{(n)}(x-n) \] is a distribution which can be written as an infinite sum of the derivatives related to the delta function.

Particularly, if \(T\) is a distribution with compact support, or a tempered distribution, \(T\) can be written as a finite sum of derivatives of functions.

\[T = \sum_{n=1}^{N}f_n^{(n)}(x) . \]

This leads to the Structure Theorems for \(D'\) and \(S'\).

Note: Here, \(f_n,\ \forall n= 1,2,3,\cdots\) can be different functions.

Structure Theorem for \(D'\): If \(T\) is a distribution on \(\mathbb{R}^n\), then there exist continuous functions \(f_{\alpha}\) such that

\[T = \sum_{\alpha} \left (\frac{\partial}{\partial x}\right )^{\alpha}f_{\alpha},\] where for every bounded open set \(\Omega\), all but a finite number of the distributions \(\left(\frac{\partial}{\partial x}\right)^\alpha f_\alpha\) vanish identically on \(\Omega\).

Here the right hand side is an infinite sum:

\[\left(\frac{\partial}{\partial x}\right )^{\alpha} = \left (\frac{\partial}{\partial x_1}\right )^{\alpha_1}\left (\frac{\partial}{\partial x_2}\right)^{\alpha_2}\cdots \left(\frac{\partial}{\partial x_n}\right)^{\alpha_n},\] and \[|\alpha| = \alpha_1+\alpha_2 + \cdots +\alpha_n \]

Structure Theorem for \(S'\): If \(T\) is a tempered distribution on \(\mathbb{R}^n\), then there exist a finite number of continuous functions \(f_{\alpha}\) satisfying \[|f_{\alpha}(x)| \leq c(1+|x|)^n\ ,\]

such that \[T = \sum\left (\frac{\partial}{\partial x}\right )^{\alpha}f_{\alpha},\]

where the right hand side is a finite sum. The Structure Theorem also allows us to define the order of a distribution.

Note:

  1. This representation is not unique; there may exist multiple representations of the same distribution \(T\).

  2. Using the structure theorem, we may introduce the order of a distribution as the highest order of the derivatives appearing in \(T = \sum(\frac{\partial}{\partial x})^{\alpha}f_{\alpha}\).

4.4 Positiveness of Distributions

Positiveness: If for every non-negative test function \(\varphi\) (i.e., \(\varphi \geq 0\)),

\[ T(\varphi) = \langle f, \varphi \rangle = \int f(x)\varphi(x)dx \geq 0 \]

is always true, then \(T\) is a positive distribution. When \(T\) is induced by a function \(f\), \(T\) is a positive distribution if and only if

\[\int f^{-}(x)dx = 0 \]

where \(f^{-}(x) = \max(-f(x), 0)\).

Example: The \(\delta\) function is a positive distribution. For any \(\varphi \geq 0\),

\[\langle \delta, \varphi \rangle = \varphi(0) \geq 0\ .\]

Note: The definition of positive distributions comes from that of positive measures. There is a one-to-one correspondence between positive distributions \(T\) and positive measures \(\mu\), such that

\[\langle T, \varphi \rangle = \int \varphi d\mu\ .\]

Proof: This follows from the Riesz Representation Theorem:

Let \(X\) be a locally compact Hausdorff space and \(\psi\) a positive linear functional on \(C_c(X)\), the space of compactly supported complex-valued continuous functions. Then there exists a Borel \(\sigma\)-algebra \(\Sigma\) on \(X\) and a unique positive Borel measure \(\mu\) on \(X\) such that

\[\psi(f) = \int_{X}f(x)d\mu(x),\ \ f \in C_c(X).\] The Riesz Representation Theorem thus draws a connection between linear functionals and measures; it tells us that any positive linear functional has a unique associated positive measure.

4.5 The Continuity of Distribution

Continuity at \(\varphi_0\): for every \(\epsilon > 0\) and \(R\) sufficiently large, there exist \(\delta(R)\) and \(m(R)\) such that

\[|\langle T, \varphi \rangle - \langle T, \varphi_0 \rangle| \leq \epsilon,\] whenever

\[\left\|\left(\frac{\partial}{\partial x}\right)^\alpha\varphi - \left(\frac{\partial}{\partial x}\right)^\alpha\varphi_0\right\|_{\infty} = \sup_{x} \left|\left(\frac{\partial}{\partial x}\right)^\alpha\varphi - \left(\frac{\partial}{\partial x}\right)^\alpha\varphi_0 \right| \leq \delta \]

for every \(|\alpha| \leq m\) and \(supp(\varphi), supp(\varphi_0) \subset \{|x| \leq R\}\).

Continuity of \(T\): the distribution \(T\) is continuous at every test function \(\varphi\). As a corollary, for a continuous distribution \(T\), if \(\varphi_j \longrightarrow \varphi\), then \(\langle T, \varphi_j \rangle \longrightarrow \langle T, \varphi \rangle\).

4.6 Probability Distribution and Distribution in the sense of Generalized Functions

Both probability distributions and distributions (generalized functions) describe the uneven distribution of physical quantities, and the Riesz Representation Theorem establishes a link between linear functionals and measures. Here we examine the associations between these two notions.

Recall the definition of a probability measure. Compared to the general notion of a measure, a probability measure \(\mu\) has additional requirements:

  1. \(0 \leq \mu\ \leq 1\), and \(\mu\) returns zero for the empty set and one for the entire space.

  2. \(\mu\) has the countable additivity property: if \(E_j,\ j = 1,\ldots,+\infty\) are countably many disjoint sets, then

\[\mu\left(\cup_{j=1}^{\infty}E_j\right) = \sum_{j\in \mathbb{N}}\mu(E_j) \ .\]

Consider the integral with respect to the probability measure:

\[\int fd\mu = \int f(x)dF(x),\] where \(F(x)\) is the c.d.f. of the corresponding probability measure. This expression can be regarded as an “expectation” of the function with respect to the probability measure. For each point \(x\) in the sample space, the integral assigns a weight (i.e., the probability) to the function value \(f(x)\).
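A small sketch of this interpretation (assuming a standard normal probability measure and the test function \(\varphi(x)=x^2+1\)): the integral against the density equals the expectation, which we can also estimate from random samples.

phi <- function(x) x^2 + 1                     # an assumed test function

# Expectation as an integral against the probability measure dF(x) = dnorm(x) dx
expected_phi <- integrate(function(x) phi(x)*dnorm(x), -Inf, Inf)$value  # E[X^2 + 1] = 2

# The same expectation estimated from random draws of the measure
set.seed(1)
sample_mean <- mean(phi(rnorm(100000)))
c(integral = expected_phi, sample_estimate = sample_mean)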

Recall what we discussed before in Positiveness of Distributions, there is a one-to-one correspondence between positive distributions \(T\) and positive measures \(\mu\). Each probability measure \(P\) on \(\Omega\) is a kind of positive measure with an additional requirement \(P(\Omega) = 1\). Thus, there is a one-to-one correspondence between a subset of positive distributions and probability measures.

For any probability measure \(\mu\), it can be regarded as a distribution and there exists a positive distribution \(T\) acting on \(D\), such that

\[T_\mu = \langle \mu, \varphi \rangle = \int\varphi d\mu.\]

The probability measure \(\mu\) assigns measure one to the whole sample space. This is equivalent to \(\langle \mu, 1\rangle = 1\) in the distribution sense.

Note: Strictly speaking, the constant function \(\varphi = 1\) is not an element of the space of test functions. More rigorously, this statement means that

\[\lim_{k \longrightarrow \infty} \langle \mu, \varphi_k \rangle = 1,\]

where \(\{\varphi_k\}\) is an arbitrary test function series that approximates the constant function \(\varphi = 1\).

The probability measure \(\mu\) also assigns measure zero to the empty set. Obviously, this is equivalent to \(\langle \mu, 0 \rangle = 0\) in the distribution sense.

We already know that each probability measure \(\mu\) is associated with a positive distribution. And \(\mu\) also requires that \(\langle \mu, 1 \rangle = 1\) and \(\langle \mu, 0 \rangle = 0\). The latter one is always true for any distribution. Intuitively, we may draw a connection between positive distributions with \(\langle \mu, 1 \rangle = 1\) and probability measures:

There is a one-to-one correspondence between positive distributions with \(\langle \mu, 1 \rangle = 1\) and probability measures.

This statement is supported by its converse: for any positive distribution with \(\langle \mu, 1 \rangle = 1\), there is a corresponding probability measure \(\mu\).

Example: Consider the Heaviside function,

\[H(x) = \left\{ \begin{aligned} 1 & , & x\geq 0 \\ 0 & , & x < 0 \end{aligned} \right.\ ,\]

which is the cumulative distribution function of a constant random variable that is almost surely equal to \(0\). The corresponding probability measure here is the delta measure, where

\[\delta(A) = \left\{ \begin{aligned} 0 & , & 0 \notin A \\ 1 & , & 0 \in A \end{aligned} \right.\ .\]

The corresponding distribution here is the generalized function \(\delta\).

\[T(\varphi) = \int\delta(x)\varphi(x)dx = \int\varphi d\delta = \int\varphi(x)dH(x) = \varphi(0).\]

5 Random drawing (sampling) from distributions

5.1 George Marsaglia’s polar sampling method

George Marsaglia’s polar sampling method is a pseudo-random number sampling algorithm for generating a pair of independent standard normal random variables. Mind the strong relation to the Cartesian representation of kime \(\kappa=(\kappa_1, \kappa_2)\in\mathbb{C}\)! The difference is that whereas the kime Cartesian coordinates could follow any bivariate distribution, Marsaglia polar sampling generates random kime coordinates \(\kappa_1, \kappa_2\sim N(0,1)\).

The Marsaglia polar random draw repeatedly samples points uniformly from the square \([-1,1]\times[-1,1] = \{(\kappa_1, \kappa_2)\in \mathbb{C}\ |\ -1\leq \kappa_1, \kappa_2\leq 1\}\), rejecting and re-sampling until \(0 < z=\kappa_1^2 + \kappa_2^2 < 1\), i.e., until the point falls inside the unit disk.

For each accepted point, the corresponding pair of independent standard normal variables is \(\left (\kappa_1\sqrt{\frac{-2\ln(z)}{z}}, \ \kappa_2\sqrt{\frac{-2\ln(z)}{z}}\right )\), where \(z=\kappa_1^2 + \kappa_2^2\) and \(0 < z < 1\).

Note that the components \(\frac{\kappa_i}{\sqrt{z}}\) represent the cosine and the sine of the kime-phase angle, i.e., the angle between the vector \(\vec{\kappa}=(\kappa_1, \kappa_2)\) and the \(x\)-axis. This is because \(\kappa_i\sqrt{\frac{-2\ln(z)}{z}}\equiv \frac{\kappa_i}{\sqrt{z}}\sqrt{-2\ln(z)}\ , \ i\in\{1,2\}\).
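A compact R sketch of the Marsaglia polar draw described above (uniform rejection sampling over the square \([-1,1]^2\), then scaling each accepted point by \(\sqrt{-2\ln(z)/z}\)); the empirical means, standard deviations, and correlation confirm that the two coordinates behave as independent \(N(0,1)\) variables.

# Marsaglia polar sampling of n pairs of independent N(0,1) kime coordinates
marsagliaPolar <- function(n) {
  out <- matrix(NA_real_, nrow = n, ncol = 2)
  for (i in 1:n) {
    repeat {                                   # rejection step: keep (k1, k2) inside the unit disk
      k1 <- runif(1, min = -1, max = 1)
      k2 <- runif(1, min = -1, max = 1)
      z  <- k1^2 + k2^2
      if (z > 0 && z < 1) break
    }
    out[i, ] <- c(k1, k2) * sqrt(-2*log(z)/z)  # scale the accepted point
  }
  out
}

set.seed(1234)
kimeSample <- marsagliaPolar(5000)
round(c(mean1 = mean(kimeSample[, 1]), sd1 = sd(kimeSample[, 1]),
        mean2 = mean(kimeSample[, 2]), sd2 = sd(kimeSample[, 2]),
        cor12 = cor(kimeSample[, 1], kimeSample[, 2])), 3)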

Lemma. Suppose \(u\sim Unif(-1,1)\), then the point \(\kappa=\left (\underbrace{\cos(\pi u)}_{\kappa_1}, \underbrace{\sin(\pi u)}_{\kappa_2}\right )\in \mathbb{C}\) is uniformly distributed on the unit circle \(\{(\kappa_1, \kappa_2)\in \mathbb{C} | \kappa_1^2 + \kappa_2^2=1\}\).

Multiplying the point \(\kappa\in\mathbb{C}\) by an independent random variable (kime-magnitude, or time) \(\rho\gt 0\) whose distribution is \(P(\rho<a)=\int_0^a {r\ e^{-r^2/2}\ dr}\) yields a point

\[\rho\kappa\equiv \kappa'=\left (\underbrace{\rho\cos(\pi u)}_{\kappa'_1},\ \underbrace{\rho\sin(\pi u)}_{\kappa'_2}\right )\in \mathbb{C},\]

whose Cartesian coordinates are jointly bivariate normal, and their marginals are independent standard normal random variables, i.e., \(\kappa'_1,\kappa'_2\sim N(0,1)\) and

\[(\kappa'_1,\kappa'_2)\sim N\left (\mu=\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \Sigma=\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\right )\ .\]

Proof. The rationale for this is based on the simple trick of deriving the bivariate normal distribution from the square of independent normal marginals.

Let \(I=\int_{-\infty}^\infty e^{-\frac{\kappa_1^2}{2}}\ d\kappa_1\) and compute the square \[\underbrace{I^2}_{squared\\ distribution}=\left (\int_{-\infty}^\infty e^{-\frac{\kappa_1^2}{2}}\ d\kappa_1\right ) \times \left (\int_{-\infty}^\infty e^{-\frac{\kappa_1^2}{2}}\ d\kappa_1\right )\\ =\left (\int_{-\infty}^\infty e^{-\frac{\kappa_1^2}{2}}\ d\kappa_1\right ) \times \left (\int_{-\infty}^\infty e^{-\frac{\kappa_2^2}{2}}\ d\kappa_2\right )\\ =\int_{-\infty}^\infty\int_{-\infty}^\infty e^{-\frac{\kappa_1^2+\kappa_2^2}{2}}\ d\kappa_1d\kappa_2 \\ \underbrace{=}_{Polar\\ coord} \int_{-\pi}^\pi\int_{0}^\infty \rho\ e^{-\frac{\rho^2}{2}}\ d\rho d\theta = 2\pi \int_{0}^\infty \rho\ e^{-\frac{\rho^2}{2}}\ d\rho\ .\]

Observe that the polar coordinate transformation yields a constant, uniform kime-phase distribution \(\theta\sim Unif(-\pi,\pi)\), whereas the radial distance \(\rho\) has density \(\rho\ e^{-\frac{\rho^2}{2}}\) (a Rayleigh distribution), so that \(\rho^2 \sim \chi^2_{df=2}\).

Therefore, the bivariate Cartesian coordinates \[\kappa'_1=\rho\cos(\pi u)\ {\text{and }} \kappa'_2=\rho\sin(\pi u)\sim N(0,1)\ .\]

Task: Expand this approach of producing a pair of independent standard normal variables, by radially projecting uniformly random points from the unit circle to a distance representing the square root of a \(\chi^2_2\) variable, to non-uniform symmetric circular distributions, such as the (truncated) Laplace kime-phase distribution.

Note: This is related to the Box–Muller transform

\[\kappa_1=\sqrt{\underbrace{-2\ln(\overbrace{u_1}^{\sim Unif(0,1)})}_{\sim \chi^2_2}} \cos\left (\pi \underbrace{u_2}_{\sim Unif(-1,1)}\right ),\\ \kappa_2=\sqrt{\underbrace{-2\ln(\overbrace{u_1}^{\sim Unif(0,1)})}_{\sim \chi^2_2}} \sin\left (\pi \underbrace{u_2}_{\sim Unif(-1,1)}\right )\ .\]

Using Euler’s formula \(e^{i \kappa} = \cos(\kappa) + i \sin(\kappa)\), let \(\rho = \sqrt{-2 \ln(u_1)}\) and \(\kappa = \pi u_2\). Then

\[\rho e^{i \kappa} = \sqrt{-2 \ln(u_1)} e^{i \pi u_2} = \sqrt{-2 \ln(u_1)}\left (\cos(\pi u_2) + i \sin(\pi u_2)\right )\ .\]

In the Marsaglia polar variant, a random bivariate point \((\kappa_1,\kappa_2)\) inside the unit disk is projected onto the unit circle by setting \(z=\kappa_1^2+\kappa_2^2\) and forming \(\left( \frac{\kappa_1}{\sqrt{z}}, \frac{\kappa_2}{\sqrt{z}} \right)\), without explicitly computing cosine and sine values. The random point on the circumference is then radially projected along the random kime-magnitude (time) displacement \(\sqrt{-2\ln(z)}\), where \(z\sim Unif(0,1)\) is independent of the random point on the circumference.

In this example, we generate \(1,000\) kime samples using the Box-Muller transform.

set.seed(2)
u1 = runif(1000, min = 0, max = 1)    # uniform draws driving the radial component
u2 = runif(1000, min = 0, max = 1)    # uniform draws driving the angular component
s = -log(u1)                          # -ln(u1) ~ Exp(1), so 2*s ~ chi-square(2)
r = sqrt(2*s)                         # kime-magnitude (radius)
theta = 2*pi*u2                       # kime-phase (angle in radians)
theta2 = theta * 180/pi               # kime-phase in degrees (for the polar plot)
k1 = sqrt(2*s) * cos(theta)           # kappa_1 ~ N(0,1)
k2 = sqrt(2*s) * sin(theta)           # kappa_2 ~ N(0,1)


library(plotly)
data1 = data.frame(r = r, theta = theta2)
# polar scatter of the (kime-magnitude, kime-phase) pairs
scatter_polar = plot_ly(data1, r = ~r, theta = ~theta, type = 'scatterpolar', 
                        mode = 'markers', name = "Polar Scatter Plot", color = I("green"))

data2 = data.frame(kappa_1 = k1, kappa_2 = k2)
# Cartesian scatter of the same samples (kappa_1, kappa_2)
scatter_plot = plot_ly(data2, x = ~kappa_1, y = ~kappa_2, type = 'scatter', 
                       mode = 'markers', name = "Cartesian Scatter Plot", color = I("green"))

scatter_polar <- scatter_polar %>% 
  layout(polar = list(radialaxis = list(title = "Kime Magnitude"),
                      angularaxis = list(title = "Kime Phase")), showlegend = TRUE)

scatter_plot <- scatter_plot %>% 
  layout(xaxis = list(title = "Kappa 1"), yaxis = list(title = "Kappa 2"), showlegend = TRUE)

scatter_polar
scatter_plot

Next, we sample kime-phases from von Mises distributions for three different processes and generate interactive plots of the corresponding kime foliation.

library(animation)
library(circular)
library(plotly)
epsilon <- 0.1
sampleSize <- 1000   # total number of phases to sample for 3 different processes (x, y, z)
sizePerTime <- 100   # number of phases to use for each fixed time (must divide sampleSize)
circleUniformPhi <- seq(from=-pi, to=pi, length.out=sizePerTime)
totalTimes <- as.integer(sampleSize/sizePerTime)

u <- seq(from=0, to=1, length.out=sizePerTime)  # Uniform[0,1]
v <- seq(from=0, to=1, length.out=sampleSize %/% sizePerTime)  # Uniform[0,1]
kappa1 <- cos(pi * u) # Cartesian kime coordinates kappa 1 (x-axis)
kappa2 <- sin(pi * u) # kappa 2 (y-axis)
# kappaGrid <- matrix(expand.grid(u, v))
kappaGrid <- expand.grid(u, v)

oopt = ani.options(interval = 0.2)
set.seed(1234)
# sample the kime-phases for all 3 different processes (A,B,C) and the r time points
A <- rvonmises(n=sampleSize, mu=circular(pi/5), kappa=3)
B <- rvonmises(n=sampleSize, mu=circular(-pi/3), kappa=5)
C <- rvonmises(n=sampleSize, mu=circular(0), kappa=10)
r <- seq(from=1, to=sampleSize/sizePerTime, length.out=10)

# Define a function that renormalizes the kime-phase from [0, 2*pi) to [-pi, pi)
pheRenormalize <- function (x) {
  out <- ifelse(as.numeric(x) > pi, as.numeric(x) - 2*pi, as.numeric(x))
  return (out)
}

# transform Von Mises samples from [0, 2*pi) to [-pi, pi)
A <- pheRenormalize(A)
B <- pheRenormalize(B)
C <- pheRenormalize(C)


# pl_list <- list()
pl_scene <- plot_ly(type='scatter3d', mode="markers")
plotA <- list() 
plotB <- list() 
plotC <- list() 

plotA_df <- list()   # need separate data frames to store all time foliations
plotB_df <- list()
plotC_df <- list()

for (t in 1:length(r)) {  # loop over time
  # extract the block of sizePerTime kime-phases corresponding to time t
  plotA[[t]] <- as.numeric(A[((t-1)*sizePerTime + 1):((t-1)*sizePerTime + sizePerTime)])
  plotB[[t]] <- as.numeric(B[((t-1)*sizePerTime + 1):((t-1)*sizePerTime + sizePerTime)])
  plotC[[t]] <- as.numeric(C[((t-1)*sizePerTime + 1):((t-1)*sizePerTime + sizePerTime)])
  
  tempA = circular(unlist(plotA[[t]]))
  tempB = circular(unlist(plotB[[t]]))
  tempC = circular(unlist(plotC[[t]]))
  
  resA <- density(tempA, bw=25, xaxt='n', yaxt='n')
  resB <- density(tempB, bw=25, xaxt='n', yaxt='n')
  resC <- density(tempC, bw=25, xaxt='n', yaxt='n')

  unifPhi_df <- as.data.frame(cbind(t=t, circleUniformPhi=circleUniformPhi))
  plotA_df[[t]] <- as.data.frame(cbind(t=t, plotA=unlist(plotA[[t]])))
  plotB_df[[t]] <- as.data.frame(cbind(t=t, plotB=unlist(plotB[[t]])))
  plotC_df[[t]] <- as.data.frame(cbind(t=t, plotC=unlist(plotC[[t]])))
  
  pl_scene <- pl_scene %>% add_trace(data=unifPhi_df, showlegend=FALSE,
                      x = ~((t-epsilon)*cos(circleUniformPhi)), 
                      y = ~((t-epsilon)*sin(circleUniformPhi)), z=0,
                      name=paste0("Time=",t), line=list(color='gray'),
                      mode = 'lines', opacity=0.3) %>%
    add_markers(data=plotA_df[[t]], x=~(t*cos(plotA)), y=~(t*sin(plotA)), z=0,
                      type='scatter3d', name=paste0("A: t=",t),
                      marker=list(color='green'), showlegend=FALSE,
                      mode = 'markers', opacity=0.3) %>%
    add_markers(data=plotB_df[[t]], x=~((t+epsilon)*cos(plotB)),
                    y=~((t+epsilon)*sin(plotB)), z=0-epsilon, showlegend=FALSE,
                    type='scatter3d', name=paste0("B: t=",t), 
                    marker=list(color='blue'),
                    mode = 'markers', opacity=0.3) %>%
    add_markers(data=plotC_df[[t]], x=~((t+2*epsilon)*cos(plotC)),
                y=~((t+2*epsilon)*sin(plotC)), z=0+epsilon, showlegend=FALSE,
                type='scatter3d', name=paste0("C: t=",t), 
                marker=list(color='red'),
                mode = 'markers', opacity=0.3)
} 
# pl_scene

means_df <- as.data.frame(cbind(t = c(1:length(r)),
                                plotA_means=unlist(lapply(plotA, mean)),
                                plotB_means=unlist(lapply(plotB, mean)),
                                plotC_means=unlist(lapply(plotC, mean))))
pl_scene <- pl_scene %>% 
  # add averaged (denoised) phase trajectories
  add_trace(data=means_df, x=~(t*cos(plotA_means)),
        y=~(t*sin(plotA_means)), z=0,
        type='scatter3d', showlegend=FALSE, mode='lines', name="Expected Obs A",
        line=list(color='green', width=15), opacity=0.8) %>%
  add_trace(data=means_df, x=~(t*cos(plotB_means)), 
        y=~(t*sin(plotB_means)), z=0-epsilon,
        type='scatter3d', showlegend=FALSE, mode='lines', name="Expected Obs B", 
        line=list(color='blue', width=15), opacity=0.8) %>%
  add_trace(data=means_df, x=~(t*cos(plotC_means)), 
        y=~(t*sin(plotC_means)), z=0+epsilon,
        type='scatter3d', showlegend=FALSE, mode='lines', name="Expected Obs C", 
        line=list(color='red', width=15), opacity=0.8) %>%
  add_trace(x=0, y=0, z=c(-2,2), name="Space", showlegend=FALSE,
             line=list(color='gray', width=15), opacity=0.8) %>%
  # Add Uniform Grid [-1,1] * [0,1]
  add_trace(x=kappaGrid[[1]], y=kappaGrid[[2]], z=0, name="Uniform Grid", showlegend=FALSE,
             marker=list(color='orange', width=15), opacity=0.8) %>%
  layout(title="Pseudo Spacekime (1D Space, 2D Kime) Kime-Phase Sampling and Foliation",
          scene = list(xaxis=list(title="Kappa1"), yaxis=list(title="Kappa2"),
                        zaxis=list(title="Space"))) %>% hide_colorbar()
pl_scene
#################################################################
# Organize data - dfXYZP - dataframe, 
# (X,Y,Z) Cartesian Coordinates, t=time, P=Phi, (A,B,C) Processes
#################################################################
df <- array(0, c(length(r), sizePerTime, 6))
dfXYZP <- data.frame()

for (t in 1:length(r)) {   # loop over time
  for (phi in 1:sizePerTime)  { # loop over phase index
    for (coord in 1:6) {     # loop over X,Y,Z coordinates, t, phi, Phi_Value
      if (coord==1)      df[t, phi, coord]  <- A[(t-1)*sizePerTime + phi]  # X process
      else if (coord==2) df[t, phi, coord]  <- B[(t-1)*sizePerTime + phi]
      else if (coord==3) df[t, phi, coord]  <- C[(t-1)*sizePerTime + phi]
      else if (coord==4) df[t, phi, coord]  <- t    # t: Parametric Grid
      else if (coord==5) df[t, phi, coord]  <- phi  # Phi: Parametric Grid
      else df[t, phi, coord]  <- circleUniformPhi[phi]  # actual value of the phase Phi
      
      ind = (t-1)*sizePerTime + phi  # dfXYZP row index
      if (coord <= 3) {  # The 3 simulated processes
        dfXYZP[length(r)*sizePerTime*(coord-1) + ind, 1] <- 
              t*cos(df[t, phi, coord]) ## X    (X-Cartesian Coordinate)
        dfXYZP[length(r)*sizePerTime*(coord-1) + ind, 2] <- 
              t*sin(phi) ## Y    (Y-Cartesian Coordinate)
        dfXYZP[length(r)*sizePerTime*(coord-1) + ind, 3] <- 
              (4-coord)*epsilon ## Z    (Z-Cartesian Coordinate)
        dfXYZP[length(r)*sizePerTime*(coord-1) + ind, 4] <- t ## t  (kime-magnitude)
        dfXYZP[length(r)*sizePerTime*(coord-1) + ind, 5] <- phi ## Phi  (kime-phase)
        dfXYZP[length(r)*sizePerTime*(coord-1) + ind, 6] <- coord  ## Process (color)
      } else if (coord==4) {   # t indexing
        dfXYZP[length(r)*sizePerTime*(coord-1) + ind, 1] <- t # phi/sizePerTime # U[0,1]
        dfXYZP[length(r)*sizePerTime*(coord-1) + ind, 2] <- t # (t-1)/length(r) # U[0,1]
        dfXYZP[length(r)*sizePerTime*(coord-1) + ind, 3] <- t # z
        dfXYZP[length(r)*sizePerTime*(coord-1) + ind, 4] <- t # t
        dfXYZP[length(r)*sizePerTime*(coord-1) + ind, 5] <- t # phi
        dfXYZP[length(r)*sizePerTime*(coord-1) + ind, 6] <- coord
      } else if (coord==4) {   # phi indexing
        dfXYZP[length(r)*sizePerTime*(coord-1) + ind, 1] <- phi/sizePerTime
        dfXYZP[length(r)*sizePerTime*(coord-1) + ind, 2] <- phi
        dfXYZP[length(r)*sizePerTime*(coord-1) + ind, 3] <- phi
        dfXYZP[length(r)*sizePerTime*(coord-1) + ind, 4] <- phi
        dfXYZP[length(r)*sizePerTime*(coord-1) + ind, 5] <- phi
        dfXYZP[length(r)*sizePerTime*(coord-1) + ind, 6] <- coord
      } else if (coord==5) {   # phi Value
        dfXYZP[length(r)*sizePerTime*(coord-1) + ind, 1] <- phi/sizePerTime
        dfXYZP[length(r)*sizePerTime*(coord-1) + ind, 2] <- phi
        dfXYZP[length(r)*sizePerTime*(coord-1) + ind, 3] <- phi
        dfXYZP[length(r)*sizePerTime*(coord-1) + ind, 4] <- t
        dfXYZP[length(r)*sizePerTime*(coord-1) + ind, 5] <- phi 
        dfXYZP[length(r)*sizePerTime*(coord-1) + ind, 6] <- circleUniformPhi[phi] # phi # ?
      } else {  # if (coord==6) #  Uniform[0,1] * Uniform[0,1] parametric Grid
        dfXYZP[length(r)*sizePerTime*(coord-1) + ind, 1] <- u[phi] * (sampleSize %/% sizePerTime)
        dfXYZP[length(r)*sizePerTime*(coord-1) + ind, 2] <- v[t] * (sampleSize %/% sizePerTime)
        dfXYZP[length(r)*sizePerTime*(coord-1) + ind, 3] <- 0
        dfXYZP[length(r)*sizePerTime*(coord-1) + ind, 4] <- t
        dfXYZP[length(r)*sizePerTime*(coord-1) + ind, 5] <- phi
        dfXYZP[length(r)*sizePerTime*(coord-1) + ind, 6] <- coord
      }
    }
  }
} 
colnames(dfXYZP) <- c("X", "Y", "Z", "t", "Phi", "Process")
pl_scene <- dfXYZP[-c(3001:5000), ] %>%
  highlight_key(~Phi) %>% plot_ly(type='scatter3d', mode="markers+text") %>%
  add_markers(x=~X, y=~Y, z=~Z, text=~Process, 
              name=~paste0("Process: ", Process), #, "<br>t: ", t, "<br>Phi: ", Phi, "<extra></extra>"), 
              showlegend=FALSE, color=~Process, textposition="top", opacity=0.3,
              hoverinfo = ~paste("Process: ", Process, "<br>X: %{x}<br>",
                      "Y: %{y}<br>", "Z: %{z}<br>", 
                      "t: ", t, "<br>Phi: ", Phi, "<extra></extra>")) %>% #  "x+y+color") %>%
  highlight(on = "plotly_hover", off = "plotly_doubleclick", dynamic=TRUE) %>%
  layout(title="Marsaglia Polar Random Kime-Phase Sampling \n Three Different Processes (Colors)\n Yellow Grid shows the Time * Phase indexing Corresponding to Kime\n Mouse-over Shows the Corresponding Longitudinal Trajectories for a Fixed Phase Index",
          scene = list(xaxis=list(title="Kappa1"), yaxis=list(title="Kappa2"),
                        zaxis=list(title="Process"))) %>% hide_colorbar()
pl_scene
## 3D array to a dataframe
# library(reshape2)
# df1 <- melt(df)  # Var1=time, Var2=Phase, Var3=Process, value=Value
# colnames(df1)<- c("time", "phi", "Process", "value")
# df2 <- df1[, -1]  # "Time * phi", "Process", "value"
# 
# pl_scene <- plot_ly(data=dfXYZP, type='scatter3d', mode="markers")
# 
# pl_scene <- dfXYZP %>%
#   highlight_key(~Phi) %>% plot_ly(type='scatter3d', mode="markers+text") %>%
#   add_markers(x=~X, y=~Y, z=~Z, text=~Process, name=~paste0("Process: ", Process), 
#               marker=list(color=~as.factor(Process)), showlegend=FALSE,
#               textposition = "top", hoverinfo = "x+y", opacity=0.3) %>%
#   highlight(on = "plotly_hover", off = "plotly_doubleclick") 
#     # %>%
#     # add_trace(x=0, y=0,  z=c(-2,2), name="Space", showlegend=FALSE,
#     #           line=list(color='gray', width=15), opacity=0.8) # Vertical Space
# 
# pl_scene

# pl_scene <- plot_ly(data = df, type='scatter3d', mode="markers")
# 
# for (t in 1:length(r)) {   # loop over time  # t=1
#   pl_scene <- as.data.frame(df[t,,]) %>%
#     highlight_key(~V1) %>% plot_ly(type='scatter3d', mode="markers+text") %>%
#     add_markers(x=~t*cos(V2), y=~t*sin(V2), text = ~V2,
#                   z=0+epsilon/2, type='scatter3d', name=paste0("X: t=",t), # Process 1
#                   marker=list(color='green'), showlegend=FALSE,
#                   textposition = "top", hoverinfo = "x+y", opacity=0.3) %>%
#     add_markers(x=~(t+epsilon)*cos(V3), text = ~V3,                # Process 2
#                   y=~(t+epsilon)*sin(V3), z=0-epsilon, showlegend=FALSE,
#                   name=paste0("Y: t=",t), 
#                   marker=list(color='blue'), mode='markers', opacity=0.3)  %>%
#     add_markers(x=~(t+2*epsilon)*cos(V4),               # Process 3
#                   y=~(t+2*epsilon)*sin(V4), z=0+epsilon, showlegend=FALSE,
#                   type='scatter3d', name=paste0("Z: t=",t), 
#                   marker=list(color='red'), mode='markers', opacity=0.3) %>%
#     add_markers(x=~(t*epsilon)*V5,   # Process 4 (Grid)
#                   y=t, z=0, showlegend=FALSE,
#                   type='scatter3d', name=paste0("Grid: t=",t), 
#                   marker=list(color='lightgray'), mode='markers', opacity=0.3) %>%
#     highlight(on = "plotly_hover", off = "plotly_doubleclick") 
#     # %>%
#     # add_trace(x=0, y=0,  z=c(-2,2), name="Space", showlegend=FALSE,
#     #           line=list(color='gray', width=15), opacity=0.8) # Vertical Space
# }
# 
# pl_scene

# pl_scene <- plot_ly(type='scatter3d', mode="markers")
# 
# pl_scene <- 
#     pl_scene %>% add_trace(data=unifPhi_df, showlegend=FALSE, # Radial Circles
#                       x = ~((t-epsilon)*cos(circleUniformPhi)), 
#                       y = ~((t-epsilon)*sin(circleUniformPhi)), z=0,
#                       name=paste0("Time=",t), line=list(color='gray'),
#                       mode = 'lines', opacity=0.3) %>%
#     add_markers(x=(t*cos(df[t, , 1])), y=(t*sin(df[t, , 1])), 
#                 z=0+epsilon/2, type='scatter3d', name=paste0("X: t=",t), # Process 1
#                 marker=list(color='green'), showlegend=FALSE,
#                 mode = 'markers', opacity=0.3) %>%
#     add_markers(x=(t+epsilon)*cos(df[t, , 2]),                 # Process 2
#                 y=(t+epsilon)*sin(df[t, , 2]), z=0-epsilon, showlegend=FALSE,
#                 type='scatter3d', name=paste0("Y: t=",t), 
#                     marker=list(color='blue'), mode='markers', opacity=0.3) %>%
#     add_markers(x=(t+2*epsilon)*cos(df[t, , 3]),               # Process 3
#                 y=(t+2*epsilon)*sin(df[t, , 3]), z=0+epsilon, showlegend=FALSE,
#                 type='scatter3d', name=paste0("Z: t=",t), 
#                 marker=list(color='red'), mode='markers', opacity=0.3) %>%
#     add_markers(x=(t*epsilon)*(df[t, , 5]),   # Process 4 (Grid)
#                 y=t, z=0, showlegend=FALSE,
#                 type='scatter3d', name=paste0("Grid: t=",t), 
#                 marker=list(color='lightgray'), mode='markers', opacity=0.3) %>%
#   add_trace(x=0, y=0, z=c(-2,2), name="Space", showlegend=FALSE,   # Vertical Space
#              line=list(color='gray', width=15), opacity=0.8) 
# pl_scene

pl_scene <- pl_scene %>% # Add Uniform Grid [-1,1] * [0,1]
  add_trace(x=5*kappaGrid[[1]], y=5*kappaGrid[[2]], z=0, name="Uniform Grid", 
            showlegend=FALSE, marker=list(color='orange',width=15), opacity=0.8)
  
for (t in 1:length(r)) {   # loop over time
  tempA = circular(unlist(df[t, , 1]))  # X Process
  tempB = circular(unlist(df[t, , 2]))  # Y
  tempC = circular(unlist(df[t, , 3]))  # Z
  unifPhi_df <- as.data.frame(cbind(t=t, circleUniformPhi=circleUniformPhi))
  
  pl_scene <- 
    pl_scene %>% add_trace(data=unifPhi_df, showlegend=FALSE, # Radial Circles
                      x = ~((t-epsilon)*cos(circleUniformPhi)), 
                      y = ~((t-epsilon)*sin(circleUniformPhi)), z=0,
                      name=paste0("Time=",t), line=list(color='gray'),
                      mode = 'lines', opacity=0.3) %>%
    add_markers(x=(t*cos(df[t, , 1])), y=(t*sin(df[t, , 1])), 
                z=0, type='scatter3d', name=paste0("X: t=",t), # Process 1
                marker=list(color='green'), showlegend=FALSE,
                mode = 'markers', opacity=0.3) %>%
    add_markers(x=(t+epsilon)*cos(df[t, , 2]),                 # Process 2
                y=(t+epsilon)*sin(df[t, , 2]), z=0-epsilon, showlegend=FALSE,
                type='scatter3d', name=paste0("Y: t=",t), 
                    marker=list(color='blue'), mode='markers', opacity=0.3) %>%
    add_markers(x=(t+2*epsilon)*cos(df[t, , 3]),               # Process 3
                y=(t+2*epsilon)*sin(df[t, , 3]), z=0+epsilon, showlegend=FALSE,
                type='scatter3d', name=paste0("Z: t=",t), 
                marker=list(color='red'), mode='markers', opacity=0.3) %>%
    add_markers(x=(t*epsilon)*(df[t, , 4]/sizePerTime),   # Process 4 (Grid)
                y=t, z=0+2*epsilon, showlegend=FALSE,
                type='scatter3d', name=paste0("Grid: t=",t), 
                marker=list(color='lightgray'), mode='markers', opacity=0.3)
}