MISRA-C diary (C language diary)


C language: translation phases




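* A nonempty source file shall end in a new-line character. [S]
* Furthermore, this new-line character shall not be immediately preceded by a backslash character (before any splicing takes place). [S]

* A source file shall not end in a partial preprocessing token or in a partial comment. [S]
+ Whether each nonempty sequence of white-space characters other than new-line is retained or replaced by one space character is implementation-defined.

+ If a character sequence that matches the syntax of a universal character name is produced by token concatenation, the behavior is undefined.

+ If there is no corresponding member, the character is converted to an implementation-defined member other than the null (wide) character.

(6) Concatenation (string literals)
  Adjacent string literal tokens are concatenated.

(7) Translation

(8) Linkage (objects, functions)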
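
The two end-of-file rules at the top of this list can be checked mechanically. The following is only a rough sketch: `source_ends_properly` is a hypothetical helper name, it looks at raw bytes only, and it does not handle trigraphs or detect an unterminated comment or preprocessing token.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not part of any standard API): returns true when the
 * buffer obeys the two end-of-file rules. Empty files are always accepted. */
static bool source_ends_properly(const char *text, size_t len)
{
    if (len == 0) {
        return true;    /* the rules apply to nonempty source files only */
    }
    if (text[len - 1] != '\n') {
        return false;   /* the file must end in a new-line character */
    }
    if (len >= 2 && text[len - 2] == '\\') {
        return false;   /* the final new-line must not follow a backslash */
    }
    return true;
}

/* Small self-check showing the intended results. */
static void demo_source_ends_properly(void)
{
    assert(source_ends_properly("", 0));
    assert(source_ends_properly("int x;\n", 7));
    assert(!source_ends_properly("int x;", 6));
    assert(!source_ends_properly("int x; \\\n", 9));
}
```

Calling `demo_source_ends_properly()` from a test driver exercises all four cases.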





6.4 Lexical elements
A token is the minimal lexical element of the language in translation phases 7 and 8.
The categories of tokens are: keywords, identifiers, constants, string literals, and punctuators.
A preprocessing token is the minimal lexical element of the language in translation phases 3 through 6.
The categories of preprocessing tokens are: header names,
identifiers, preprocessing numbers, character constants, string literals, punctuators, and
single non-white-space characters that do not lexically match the other preprocessing
token categories.69) If a ' or a " character matches the last category, the behavior is
undefined. Preprocessing tokens can be separated by white space; this consists of
comments (described later), or white-space characters (space, horizontal tab, new-line, vertical tab, and form-feed), or both. 
As described in 6.10, in certain circumstances during translation phase 4, white space (or the absence thereof) serves as more than preprocessing token separation.
69) An additional category, placemarkers, is used internally in translation phase 4; it cannot occur in source files.
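
To see that a comment really does act as white space between preprocessing tokens, the sketch below stringizes a macro argument. The macro name `STR` and the variable names are made up for this illustration.

```c
#include <assert.h>
#include <string.h>

#define STR(x) #x   /* stringizes the preprocessing tokens of its argument */

/* The comment was replaced by one space in phase 3, so the argument is the
 * two tokens a and b separated by white space. */
static const char *const with_comment = STR(a/**/b);

/* With no separator at all, ab is a single identifier. */
static const char *const without_separator = STR(ab);

/* A comment separates tokens in ordinary declarations as well. */
static int/* acts like a space */declared_ok = 1;

static void demo_comment_as_white_space(void)
{
    assert(strcmp(with_comment, "a b") == 0);
    assert(strcmp(without_separator, "ab") == 0);
    assert(declared_ok == 1);
}
```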

6.4.1  Keywords
The above tokens (case sensitive) are reserved (in translation phases 7 and 8) for use as keywords, and shall not be used otherwise. 
The keyword _Imaginary is reserved for specifying imaginary types.70)

When preprocessing tokens are converted to tokens during translation phase 7, if a preprocessing token could be converted to either a keyword or an identifier, it is converted to a keyword.

Predefined identifiers
This name is encoded as if the implicit declaration had been written in the source
character set and then translated into the execution character set as indicated in translation phase 5.
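
The predefined identifier in question is `__func__`. A minimal sketch of how it behaves, assuming a function named `current_function_name` that exists only for this example:

```c
#include <assert.h>
#include <string.h>

/* __func__ behaves like a static const char array holding the name of the
 * enclosing function, so returning a pointer to it is safe. */
static const char *current_function_name(void)
{
    return __func__;
}

static void demo_func_identifier(void)
{
    assert(strcmp(current_function_name(), "current_function_name") == 0);
}
```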

6.4.5 String literals

In translation phase 6, the multibyte character sequences specified by any sequence of
adjacent character and identically-prefixed string literal tokens are concatenated into a
single multibyte character sequence. If any of the tokens has an encoding prefix, the
resulting multibyte character sequence is treated as having the same prefix; otherwise, it
is treated as a character string literal. Whether differently-prefixed wide string literal
tokens can be concatenated and, if so, the treatment of the resulting multibyte character
sequence are implementation-defined.
In translation phase 7, a byte or code of value zero is appended to each multibyte
character sequence that results from a string literal or literals.78)
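
A short sketch of phases 5 to 7 working on string literals follows; the array names are invented for the example. Because escape sequences are converted in phase 5, before the concatenation in phase 6, `"\x12" "3"` is two characters and not the single escape `\x123`.

```c
#include <assert.h>
#include <string.h>

/* Phase 6 joins the adjacent literals; phase 7 appends the terminating zero. */
static const char joined[] = "abc" "def";

/* The escape \x12 is already converted before concatenation happens, so the
 * result is the character 0x12 followed by the character '3'. */
static const char escaped[] = "\x12" "3";

static void demo_string_concatenation(void)
{
    assert(strcmp(joined, "abcdef") == 0);
    assert(sizeof joined == 7);      /* six characters plus the null */
    assert(sizeof escaped == 3);     /* two characters plus the null */
    assert(escaped[0] == 0x12);
    assert(escaped[1] == '3');
    assert(escaped[2] == '\0');
}
```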

6.4.8 Preprocessing numbers
A preprocessing number does not have type or a value; it acquires both after a successful
conversion (as part of translation phase 7) to a floating constant token or an integer
constant token.
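
As an illustration, something like `1.2.3` is a perfectly good preprocessing number even though it could never be converted to a constant in phase 7. The sketch below gets away with it because the macro (a hypothetical `STR`) turns it into a string literal during phase 4, so the conversion is never attempted.

```c
#include <assert.h>
#include <string.h>

#define STR(x) #x   /* stringizes its argument during phase 4 */

/* 1.2.3 and 12ab each form one preprocessing number. Neither is a valid
 * constant, but they never reach phase 7 as numbers here. */
static const char *const dotted = STR(1.2.3);
static const char *const mixed = STR(12ab);

/* An ordinary pp-number that does convert, gaining type int and value 42. */
static const int converted = 42;

static void demo_preprocessing_numbers(void)
{
    assert(strcmp(dotted, "1.2.3") == 0);
    assert(strcmp(mixed, "12ab") == 0);
    assert(converted == 42);
}
```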

6.10 Preprocessing directives
A preprocessing directive consists of a sequence of preprocessing tokens that satisfies the
following constraints: The first token in the sequence is a # preprocessing token that (at
the start of translation phase 4) is either the first character in the source file (optionally
after white space containing no new-line characters) or that follows white space
containing at least one new-line character.

The only white-space characters that shall appear between preprocessing tokens within a
preprocessing directive (from just after the introducing # preprocessing token through
just before the terminating new-line character) are space and horizontal-tab (including
spaces that have replaced comments or possibly other white-space characters in
translation phase 3).

#define EMPTY
EMPTY # include <file.h>
the sequence of preprocessing tokens on the second line is not a preprocessing directive, because it does not
begin with a # at the start of translation phase 4, even though it will do so after the macro EMPTY has been
replaced.
6.10.1 Conditional inclusion

166) Because the controlling constant expression is evaluated during translation phase 4, all identifiers
either are or are not macro names — there simply are no keywords, enumeration constants, etc.
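
Footnote 166 can be observed directly. In the sketch below (all macro and enumerator names are made up), neither the keyword `int` nor the enumeration constant `LIMIT` is a macro name, so inside `#if` both are simply replaced by 0.

```c
#include <assert.h>

enum { LIMIT = 5 };   /* only becomes meaningful in phase 7 */

/* In phase 4, LIMIT is just an identifier that is not a macro: it becomes 0. */
#if LIMIT == 5
#define ENUM_VISIBLE_IN_IF 1
#else
#define ENUM_VISIBLE_IN_IF 0
#endif

/* The same happens to identifiers that are spelled like keywords. */
#if int == 0
#define KEYWORD_SEEN_AS_ZERO 1
#else
#define KEYWORD_SEEN_AS_ZERO 0
#endif

static void demo_identifiers_in_if(void)
{
    assert(LIMIT == 5);                  /* phase 7 sees the real value */
    assert(ENUM_VISIBLE_IN_IF == 0);     /* phase 4 did not */
    assert(KEYWORD_SEEN_AS_ZERO == 1);
}
```

Some compilers can warn about this pattern (for example GCC with `-Wundef`), which is a useful check when an enumeration constant ends up in a `#if` by mistake.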

167) Thus, on an implementation where INT_MAX is 0x7FFF and UINT_MAX is 0xFFFF, the constant
0x8000 is signed and positive within a #if expression even though it would be unsigned in
translation phase 7.
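
The same effect as in footnote 167 can be reproduced on a more common platform. This sketch assumes a 32-bit `int` (the runtime check is guarded by `UINT_MAX` for that reason) and uses invented names. Inside `#if` the constant `0xFFFFFFFF` has the widest signed type, while in phase 7 it is an `unsigned int`.

```c
#include <assert.h>
#include <limits.h>

/* Phase 4: both operands behave as intmax_t, so -1 < 4294967295 holds. */
#if -1 < 0xFFFFFFFF
#define PP_COMPARE 1
#else
#define PP_COMPARE 0
#endif

/* Phase 7: with a 32-bit int the constant is unsigned int, so -1 is
 * converted to UINT_MAX and the comparison yields 0. */
static int phase7_compare(void)
{
    return -1 < 0xFFFFFFFF;
}

static void demo_if_arithmetic(void)
{
    assert(PP_COMPARE == 1);
#if UINT_MAX == 0xFFFFFFFF
    assert(phase7_compare() == 0);   /* holds only under the 32-bit int assumption */
#endif
}
```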

6.10.2 Source file inclusion

170) Note that adjacent string literals are not concatenated into a single string literal (see the translation
phases); thus, an expansion that results in two string literals is an invalid directive.

6.10.3 Macro replacement
171) Since, by macro-replacement time, all character constants and string literals are preprocessing tokens,
not sequences possibly containing identifier-like subsequences (see translation phases), they
are never scanned for macro names or parameters.

The ## operator
173) Placemarker preprocessing tokens do not appear in the syntax because they are temporary entities that
exist only within translation phase 4.

Scope of macro definitions
A macro definition lasts (independent of block structure) until a corresponding #undef
directive is encountered or (if none is encountered) until the end of the preprocessing
translation unit. Macro definitions have no significance after translation phase 4.
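
The macro replacement notes in this section (string literals are not scanned, placemarkers exist only in phase 4, and macro scope ignores block structure) can be sketched together. Every macro and variable name below is hypothetical and chosen only for the illustration.

```c
#include <assert.h>
#include <string.h>

#define ANSWER 42
/* A string literal is one preprocessing token; ANSWER inside it is not replaced. */
static const char *const literal = "ANSWER";

/* Parameters are not substituted inside string literals either. */
#define QUOTE_PARAM(x) "x"

/* An empty argument becomes a placemarker, so value ## placemarker is value. */
#define CAT(a, b) a##b
static int CAT(value, ) = 7;

static int read_block_macro(void)
{
    {
#define INSIDE_BLOCK 10   /* defined inside a nested block */
    }
    return INSIDE_BLOCK;  /* still visible: macros ignore block structure */
}

#undef INSIDE_BLOCK       /* the definition ends here */
#ifdef INSIDE_BLOCK
#define STILL_DEFINED 1
#else
#define STILL_DEFINED 0
#endif

static void demo_macro_replacement(void)
{
    assert(strcmp(literal, "ANSWER") == 0);
    assert(strcmp(QUOTE_PARAM(hello), "x") == 0);
    assert(value == 7);
    assert(read_block_macro() == 10);
    assert(STILL_DEFINED == 0);
}
```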

6.10.4 Line control
The line number of the current source line is one greater than the number of new-line
characters read or introduced in translation phase 1 while processing the source
file to the current token.
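
A small sketch of this counting rule, using `#line` so that the numbers do not depend on where the fragment sits in a file (the variable names are invented). Note that a backslash-spliced line still counts as two lines, because the new-line characters are counted in phase 1, before splicing in phase 2.

```c
#include <assert.h>

#line 100
static const int first_line = __LINE__;        /* line 100 */
static const int second_line = __LINE__;       /* line 101 */
static const int spliced = 1 + \
                           2;                  /* lines 102 and 103 */
static const int after_splice = __LINE__;      /* line 104 */

static void demo_line_numbers(void)
{
    assert(first_line == 100);
    assert(second_line == 101);
    assert(spliced == 3);
    assert(after_splice == 104);
}
```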

6.10.9 Pragma operator
The resulting sequence of characters is processed through translation phase 3 to produce
preprocessing tokens that are executed as if they were the pp-tokens in a pragma directive.

J.2 Undefined behavior
A reserved keyword token is used in translation phase 7 or 8 for some purpose other
than as a keyword (6.4.1)

J.3 Implementation-defined behavior
J.3.1 Translation
— Whether each nonempty sequence of white-space characters other than new-line is
retained or replaced by one space character in translation phase 3.

J.3.2 Environment
1 — The mapping between physical source file multibyte characters and the source
character set in translation phase 1.