
CONTENTS
Calculus
dy/dx Derivative of y with respect to x
∂y/∂x Partial derivative of y with respect to x
∇_x y Gradient of y with respect to x
∇_X y Matrix derivatives of y with respect to X
∂f/∂x Jacobian matrix J ∈ R^{m×n} of a function f : R^n → R^m
H(f)(x) The Hessian matrix of f at input point x
∫ f(x) dx Definite integral over the entire domain of x
∫_S f(x) dx Definite integral with respect to x over the set S
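
As a rough illustration of the Jacobian entry above, the sketch below approximates J ∈ R^{m×n} for a function f : R^n → R^m with central finite differences. The names `jacobian` and `example_f`, the step size `eps`, and the choice of finite differences are assumptions made for this example only; they are not part of the notation.

```python
from typing import Callable, List

Vector = List[float]


def jacobian(f: Callable[[Vector], Vector], x: Vector, eps: float = 1e-6) -> List[Vector]:
    """Approximate the m-by-n Jacobian of f at x using central differences."""
    m = len(f(x))  # output dimension
    n = len(x)     # input dimension
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        # Perturb only the j-th input coordinate in both directions.
        x_plus = list(x)
        x_minus = list(x)
        x_plus[j] += eps
        x_minus[j] -= eps
        f_plus = f(x_plus)
        f_minus = f(x_minus)
        for i in range(m):
            # J[i][j] approximates the partial derivative of f_i with respect to x_j.
            J[i][j] = (f_plus[i] - f_minus[i]) / (2.0 * eps)
    return J


def example_f(x: Vector) -> Vector:
    """Hypothetical f : R^2 -> R^3, used only to exercise the sketch."""
    return [x[0] * x[1], x[0] + x[1], x[0] ** 2]


J = jacobian(example_f, [2.0, 3.0])
```

Here J has m = 3 rows and n = 2 columns, matching the R^{m×n} shape in the table. Row i, column j holds the partial derivative of the i-th output with respect to the j-th input.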
Probability and Information Theory
a⊥b The random variables a and b are independent.
a⊥b | c They are conditionally independent given c.
E_{x∼P}[f(x)] or E f(x) Expectation of f(x) with respect to P(x)
Var(f(x)) Variance of f(x) under P(x)
Cov(f(x), g(x)) Covariance of f(x) and g(x) under P(x)
H(x) Shannon entropy of the random variable x
D_KL(P‖Q) Kullback-Leibler divergence of P and Q
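
To make the probability entries above concrete, here is a minimal sketch for discrete distributions, each stored as a dictionary from outcome to probability. The function names and the dictionary representation are assumptions for illustration. Entropy and KL divergence are computed in nats, in line with the natural-logarithm convention used in this notation, and `kl_divergence` assumes Q assigns nonzero probability wherever P does.

```python
import math
from typing import Callable, Dict

Dist = Dict[float, float]  # maps each outcome x to its probability P(x)


def expectation(f: Callable[[float], float], p: Dist) -> float:
    """E_{x~P}[f(x)] for a discrete distribution P."""
    return sum(prob * f(x) for x, prob in p.items())


def variance(f: Callable[[float], float], p: Dist) -> float:
    """Var(f(x)) under P, computed as E[(f(x) - E[f(x)])^2]."""
    mean = expectation(f, p)
    return expectation(lambda x: (f(x) - mean) ** 2, p)


def entropy(p: Dist) -> float:
    """Shannon entropy H(x) in nats; zero-probability outcomes contribute 0."""
    return -sum(prob * math.log(prob) for prob in p.values() if prob > 0)


def kl_divergence(p: Dist, q: Dist) -> float:
    """D_KL(P || Q) in nats; assumes q[x] > 0 whenever p[x] > 0."""
    return sum(prob * math.log(prob / q[x]) for x, prob in p.items() if prob > 0)


# Hypothetical distributions over the outcomes {0, 1}.
P = {0.0: 0.5, 1.0: 0.5}
Q = {0.0: 0.25, 1.0: 0.75}

mean_x = expectation(lambda x: x, P)
var_x = variance(lambda x: x, P)
h_p = entropy(P)
kl_pq = kl_divergence(P, Q)
kl_qp = kl_divergence(Q, P)
```

Note that `kl_pq` and `kl_qp` differ: the KL divergence is not symmetric in its arguments, which is why the notation fixes the order of P and Q.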
Functions
f ◦ g Composition of the functions f and g
f(x; θ) A function of x parameterized by θ
log x Natural logarithm of x
σ(x) Logistic sigmoid, 1/(1 + exp(−x))
ζ(x) Softplus, log(1 + exp(x))
||x||_p L^p norm of x
x^+ Positive part of x, i.e., max(0, x)
1_condition is 1 if the condition is true, 0 otherwise.
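
The functions listed above translate directly into code. The sketch below is one possible rendering, with function names chosen for illustration only. The sigmoid and softplus use algebraically equivalent rearrangements of 1/(1 + exp(−x)) and log(1 + exp(x)) so that large |x| does not overflow; that rearrangement is an implementation choice made for this example and is not part of the notation.

```python
import math
from typing import Sequence


def sigmoid(x: float) -> float:
    """Logistic sigmoid, 1 / (1 + exp(-x)), arranged to avoid overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)  # safe here: x < 0, so exp(x) < 1
    return e / (1.0 + e)


def softplus(x: float) -> float:
    """Softplus, log(1 + exp(x)), written as max(x, 0) + log(1 + exp(-|x|))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def positive_part(x: float) -> float:
    """x^+ = max(0, x)."""
    return max(0.0, x)


def lp_norm(x: Sequence[float], p: float) -> float:
    """L^p norm of the vector x, assuming p >= 1."""
    return sum(abs(v) ** p for v in x) ** (1.0 / p)


def indicator(condition: bool) -> int:
    """1 if the condition is true, 0 otherwise."""
    return 1 if condition else 0
```

For example, `lp_norm([3.0, 4.0], 2)` gives the Euclidean length of the vector (3, 4), and `indicator(3 > 2)` evaluates to 1.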