In mathematics, a smooth maximum of an indexed family $x_1, \ldots, x_n$ of numbers is a smooth approximation to the maximum function $\max(x_1, \ldots, x_n)$, meaning a parametric family of functions $m_\alpha(x_1, \ldots, x_n)$ such that for every $\alpha$, the function $m_\alpha$ is smooth, and the family converges to the maximum function: $m_\alpha \to \max$ as $\alpha \to \infty$. The concept of smooth minimum is similarly defined. In many cases, a single family approximates both: maximum as the parameter goes to positive infinity, minimum as the parameter goes to negative infinity; in symbols, $m_\alpha \to \max$ as $\alpha \to \infty$ and $m_\alpha \to \min$ as $\alpha \to -\infty$. The term can also be used loosely for a specific smooth function that behaves similarly to a maximum, without necessarily being part of a parametrized family.
Examples
Boltzmann operator
For large positive values of the parameter $\alpha$, the following formulation is a smooth, differentiable approximation of the maximum function. For negative values of the parameter that are large in absolute value, it approximates the minimum.
$$S_\alpha(x_1, \ldots, x_n) = \frac{\sum_{i=1}^n x_i e^{\alpha x_i}}{\sum_{i=1}^n e^{\alpha x_i}}$$
$S_\alpha \to \max$ as $\alpha \to \infty$, $S_0$ is the arithmetic mean of its inputs, and $S_\alpha \to \min$ as $\alpha \to -\infty$.
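The limiting behaviour of the Boltzmann operator can be checked numerically. The following is a minimal sketch in plain Python, assuming the operator is the exponentially weighted average $\sum_i x_i e^{\alpha x_i} / \sum_i e^{\alpha x_i}$; the function name `boltzmann` and the shift used to avoid overflow are illustrative choices, not part of any particular library.

```python
import math

def boltzmann(xs, alpha):
    """Exponentially weighted average of xs with weights exp(alpha * x).

    Assumed form: sum(x_i * exp(alpha * x_i)) / sum(exp(alpha * x_i)).
    """
    # Subtract the largest exponent before calling exp() so it cannot
    # overflow; the common factor cancels between numerator and denominator.
    shift = max(alpha * x for x in xs)
    weights = [math.exp(alpha * x - shift) for x in xs]
    total = sum(weights)
    return sum(x * w for x, w in zip(xs, weights)) / total

xs = [1.0, 2.0, 3.0]
print(boltzmann(xs, 50.0))   # close to max(xs)
print(boltzmann(xs, 0.0))    # arithmetic mean of xs
print(boltzmann(xs, -50.0))  # close to min(xs)
```

With a parameter of zero every weight equals one, so the result reduces to the arithmetic mean; as the parameter grows in absolute value, the weight concentrates on the largest or smallest input.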
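The gradient of the Boltzmann operator $S_\alpha$ is closely related to softmax and is given by
$$\nabla_{x_i} S_\alpha(x_1, \ldots, x_n) = \frac{e^{\alpha x_i}}{\sum_{j=1}^n e^{\alpha x_j}} \left[1 + \alpha\left(x_i - S_\alpha(x_1, \ldots, x_n)\right)\right].$$
This makes the softmax function useful for optimization techniques that use gradient descent.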
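The relationship between the gradient and softmax can be illustrated with a short sketch. It assumes the operator is the exponentially weighted average $\sum_i x_i e^{\alpha x_i} / \sum_i e^{\alpha x_i}$, in which case each partial derivative is the softmax weight of $x_i$ multiplied by $1 + \alpha(x_i - S_\alpha)$. The helper names `boltzmann`, `boltzmann_grad`, and `finite_diff_grad` are hypothetical, and the finite-difference routine is included only as a sanity check.

```python
import math

def softmax_weights(xs, alpha):
    """Softmax of alpha * xs, computed with a shift to avoid overflow."""
    shift = max(alpha * x for x in xs)
    exps = [math.exp(alpha * x - shift) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def boltzmann(xs, alpha):
    """Softmax-weighted average of xs (assumed form of the operator)."""
    return sum(x * p for x, p in zip(xs, softmax_weights(xs, alpha)))

def boltzmann_grad(xs, alpha):
    """Analytic gradient: softmax weight times (1 + alpha * (x_i - S))."""
    probs = softmax_weights(xs, alpha)
    s = sum(x * p for x, p in zip(xs, probs))
    return [p * (1.0 + alpha * (x - s)) for x, p in zip(xs, probs)]

def finite_diff_grad(xs, alpha, h=1e-6):
    """Central-difference approximation used to check the analytic gradient."""
    grad = []
    for i in range(len(xs)):
        up, down = list(xs), list(xs)
        up[i] += h
        down[i] -= h
        grad.append((boltzmann(up, alpha) - boltzmann(down, alpha)) / (2 * h))
    return grad

xs = [0.5, 1.5, -0.3]
print(boltzmann_grad(xs, 2.0))
print(finite_diff_grad(xs, 2.0))  # should agree closely with the line above
```

Under this form the partial derivatives sum to one, since the softmax weights sum to one and the weighted deviations $x_i - S_\alpha$ cancel.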
It is a non-expansive operator. As $\alpha \to \infty$, it acts like a maximum. As $\alpha \to 0$, it acts like an arithmetic mean. As $\alpha \to -\infty$, it acts like a minimum. This operator can be viewed as a particular instantiation of the quasi-arithmetic mean. It can also be derived from information-theoretic principles as a way of regularizing policies with a cost function defined by KL divergence. The operator has previously been utilized in other areas, such as power engineering.[2]
Biswas, Koushik; Kumar, Sandeep; Banerjee, Shilpak; Pandey, Ashish Kumar (2021). "SMU: Smooth activation function for deep networks using smoothing maximum technique". arXiv:2111.04682 [cs.LG].