Differing behavior by identically-coded programs when compiled
In computer programming, unspecified behavior is behavior that may vary on different implementations of a programming language. A program can be said to contain unspecified behavior when its source code may produce an executable that exhibits different behavior when compiled on a different compiler, or on the same compiler with different settings, or indeed in different parts of the same executable. While the respective language standards or specifications may impose a range of possible behaviors, the exact behavior depends on the implementation and may not be completely determined upon examination of the program's source code.[1] Unspecified behavior will often not manifest itself in the resulting program's external behavior, but it may sometimes lead to differing outputs or results, potentially causing portability problems.
Definition
To enable compilers to produce optimal code for their respective target platforms, programming language standards do not always impose a certain specific behavior for a given source code construct.[2] Not explicitly defining the exact behavior of every possible program is not considered an error or weakness in the language specification; defining all of it would be infeasible.[1] In the C and C++ languages, such non-portable constructs are generally grouped into three categories: implementation-defined, unspecified, and undefined behavior.[3]
The exact definition of unspecified behavior varies. In C++, it is defined as "behavior, for a well-formed program construct and correct data, that depends on the implementation."[4] The C++ Standard also notes that the range of possible behaviors is usually provided.[4] Unlike implementation-defined behavior, there is no requirement for the implementation to document its behavior.[4] Similarly, the C Standard defines it as behavior for which the standard "provides two or more possibilities and imposes no further requirements on which is chosen in any instance".[5] Unspecified behavior is different from undefined behavior. The latter is typically a result of an erroneous program construct or data, and no requirements are placed on the translation or execution of such constructs.[6]
Implementation-defined behavior
C and C++ distinguish implementation-defined behavior from unspecified behavior. For implementation-defined behavior, the implementation must choose a particular behavior and document it. An example in C/C++ is the size of integer data types. The choice of behavior must be consistent with the documented behavior within a given execution of the program.
Examples
Order of evaluation of subexpressions
Many programming languages do not specify the order of evaluation of the sub-expressions of a complete expression. This non-determinism can allow optimal implementations for specific platforms, for example by exploiting parallelism. If one or more of the sub-expressions has side effects, then the result of evaluating the full expression may differ depending on the order in which the sub-expressions are evaluated.[1] For example, given a = f(b) + g(b);, where f and g both modify b, the result stored in a may be different depending on whether f(b) or g(b) is evaluated first.[1] In the C and C++ languages, this also applies to function arguments. Example:[2]
#include <iostream>

int f() {
    std::cout << "In f\n";
    return 3;
}

int g() {
    std::cout << "In g\n";
    return 4;
}

int sum(int i, int j) {
    return i + j;
}

int main() {
    return sum(f(), g());
}
The resulting program will write its two lines of output in an unspecified order.[2] In some other languages, such as Java, the order of evaluation of operands and function arguments is explicitly defined.[7]