# Interface Definition Language

This document is a reference manual for the Thrift Interface Definition Language. It defines the syntax and semantics of Thrift files and defines constraints on the behavior of the generated code and libraries. It does not fully constrain the details of the generated code and libraries - and hence allows for flexibility in this regard. These details (which must be documented separately as an implementation reference) include:

* the mapping between Thrift types and programming language types and APIs;
* serialization and deserialization strategies;
* network protocols for RPCs.

The primary audience of this reference is Thrift users who want to understand the language in more detail. The secondary audience is Thrift implementers who want to learn about constraints imposed by the language on their implementation.

## Notation

Thrift syntax is described using a [modified BNF grammar notation](https://docs.python.org/3/reference/introduction.html#notation). As an example, consider the grammar for a binary literal:

```grammar
bin_literal ::=  "0" ("b" | "B") bin_digit+
bin_digit   ::=  "0" | "1"
```

This rule defines `bin_literal` as starting with `0`, followed by a lowercase or uppercase `B` and one or more binary digits (`0` or `1`).

## Lexical Elements

The Thrift language uses different lexical elements (tokens), such as identifiers, keywords, literals, and operators, as building blocks to define its syntax. This section describes what sequences of characters constitute valid tokens in the language.

When parsing a Thrift file, tokens are generated by selecting the longest possible substring from the input text that conforms to the constraints of the grammar (maximal munch).

The encoding of Thrift source files is UTF-8.

### Whitespace and Comments

Whitespace is used to separate tokens in source files.
The following characters are considered whitespace in Thrift: space (U+0020), horizontal tab (U+0009), line feed (U+000A) and carriage return (U+000D).

Thrift supports single-line and multiline comments. Single-line comments start with `//` or `#` and continue until the end of the line which is determined as the nearest line feed (U+000A) or, if there is none, the end of file. Multiline comments start with `/*` and end with the nearest `*/`. Comments are treated as whitespace.

Thrift recognizes Doxygen- and Javadoc-style docblocks for definitions. These start with `/// ` or `/** `. In addition, fields, parameters, and enum values can have inline docs which follow the definition and start with `///<` or `/**<`. Docblocks are exported in the AST and may be replicated in generated code. They are otherwise treated the same as regular comments.

### Identifiers

Identifiers begin with an uppercase or lowercase letter `A` through `Z` or an underscore (`_`). Subsequent characters can also be digits `0` through `9`. Identifiers in Thrift are case-sensitive.

```grammar
identifier  ::=  id_start id_continue*
id_start    ::=  "a"..."z" | "A"..."Z" | "_"
id_continue ::=  id_start | digit
```

A qualified identifier is a sequence of two or more identifiers separated by periods. It can be used to refer to a definition in another Thrift file in which case the first component is the name of that file without extension and the second component is the name of the definition. For example, `search_types.Query` denotes the definition named `Query` in `search_types.thrift`.

`maybe_qualified_id` denotes a qualified or unqualified identifier.

```grammar
maybe_qualified_id ::=  identifier ["." identifier]*
```

### Keywords

Thrift has two types of keywords: reserved words and context-sensitive keywords.
The following are reserved words in Thrift and cannot be used as identifiers:

```thrift
binary          false           map             struct
bool            float           namespace       throws
byte            hs_include      optional        true
const           i16             performs        typedef
cpp_include     i32             required        union
double          i64             service         void
enum            include         set
exception       interaction     stream
extends         list            string
```

Context-sensitive keywords listed below can be used as identifiers:

```thrift
client          package         safe            stateful
idempotent      permanent       server          transient
oneway          readonly        sink
```

Programming languages for which code is generated by the Thrift compiler will have their own additional reserved words. Thrift implementations are permitted to disallow use of these reserved words. The implementation reference must specify how such reserved words are treated.

### Literals

A literal is a representation of a fixed value in source code. Thrift supports integer, floating-point, string and Boolean literals.

```grammar
literal ::=
    int_literal | float_literal | string_literal | bool_literal

int_literal ::=  dec_literal | bin_literal | oct_literal | hex_literal
dec_literal ::=  "1"..."9" digit*
bin_literal ::=  "0" ("b" | "B") bin_digit+
oct_literal ::=  "0" oct_digit*
hex_literal ::=  "0" ("x" | "X") hex_digit+
digit       ::=  "0"..."9"
bin_digit   ::=  "0" | "1"
oct_digit   ::=  "0"..."7"
hex_digit   ::=  digit | "a"..."f" | "A"..."F"

float_literal ::=  digit+ "." digit+ | (digit+ ["."
                   digit+]) exponent
exponent      ::=  ("e" | "E") ["+" | "-"] digit+

string_literal ::=  '"' (dstring_char | escape_seq)* '"' |
                    "'" (sstring_char | escape_seq)* "'"
dstring_char   ::=  <any source character except "\\" or '"'>
sstring_char   ::=  <any source character except "\\" or "'">
escape_seq     ::=  simple_escape | hex_escape | unicode_escape
simple_escape  ::=  "\\" ("\n" | "\\" | "'" | "\"" | "n" | "r" | "t")
hex_escape     ::=  "\\x" hex_digit hex_digit
unicode_escape ::=  "\\u" hex_digit hex_digit hex_digit hex_digit

bool_literal ::=  "true" | "false"
```

Both single quoted and double quoted string literals support the following escape sequences:

| Escape Sequence | Meaning                       |
| --------------- | ----------------------------- |
| `\\`            | Backslash (U+005C)            |
| `\'`            | Single quote (U+0027)         |
| `\"`            | Double quote (U+0022)         |
| `\n`            | Line feed (U+000A)            |
| `\r`            | Carriage return (U+000D)      |
| `\t`            | Horizontal tab (U+0009)       |
| `\xhh`          | Code unit with hex value *hh* |
| `\uhhhh`        | Unicode scalar value *hhhh*   |

where `h` denotes a hexadecimal digit.

A backslash (`\`) at the end of a line in a multiline string is removed from the string together with the newline character (U+000A).

Here are some examples of literals in Thrift:

```thrift
42                // decimal integer literal
0xCAFE            // hexadecimal integer literal
3.14159           // floating-point literal
"Don't panic!"    // string literal using double quotes
'\u2665 of Gold'  // string literal using single quotes
true              // Boolean literal
```

Literals in Thrift do not have exact types. Instead they are represented using maximum precision and the actual type is inferred from context. For example:

```thrift
const i32 BIG = 42;
const i16 LITTLE = 42;
```

The literal `42` is treated as a 32-bit integer in the definition of `BIG` and as a 16-bit integer in the definition of `LITTLE`.
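The literal grammar above also admits some less common forms: binary and octal integers, floats with an exponent, and escape sequences inside strings. The following is a minimal sketch of those forms. The constant names are hypothetical and chosen only for illustration.

```thrift
// Hypothetical constants illustrating the literal forms in the grammar above.
const i32 FLAGS = 0b1010;           // bin_literal, decimal value 10
const i32 MODE = 0644;              // oct_literal (leading 0), decimal value 420
const i64 MASK = 0XFF;              // hex_literal; the "X" may be uppercase
const double AVOGADRO = 6.022e23;   // float_literal with an exponent
const double MILLION = 1e6;         // the fractional part is optional before an exponent
const string ROW = "name\tage\n";   // simple escapes: tab and line feed
const string QUOTED = 'It\'s "ok"'; // \' is needed only inside single quotes
const string HEART = "\u2665";      // unicode_escape takes exactly four hex digits
```

Note that, per the `dec_literal` and `oct_literal` rules, any integer literal with a leading `0` is octal, so `010` denotes eight rather than ten.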
### Operators and Punctuation

The following tokens serve as operators and punctuation in Thrift:

```
(    )    {    }    [    ]    <    >    ,    ;    @    =    +    -
```

## Thrift Files

A Thrift file starts with an optional package declaration and a, possibly empty, sequence of include and namespace directives. It is followed by a sequence of definitions which can also be empty. There can be at most one package declaration and it is normally placed at the beginning of a source file.

```grammar
thrift_file ::=
  (include_directive | package_declaration | namespace_directive)*
  definition*
```

Here is an example of a Thrift file:

```thrift
// Allows the definitions in search_types.thrift to be used here qualified
// with the prefix `search_types.`.
include "common/if/search_types.thrift"

// Directs the compiler to place the generated C++ code inside the
// namespace `facebook::peoplesearch`, the generated Java code inside
// the package `com.facebook.peoplesearch` and similarly for other
// languages. Also specifies the prefix for universal names.
package "facebook.com/peoplesearch"

// The port where the server listens for requests. This is a convenient
// way to define constants visible to code in all languages.
const i32 PORT = 3456;

// The default number of search results used below as the default value
// for `numResults`.
const i32 DEFAULT_NUM_RESULTS = 10;

// The parameter to the `search` RPC. Contains two fields - the query and
// the number of results desired.
struct PeopleSearchRequest {
  1: search_types.Query query;
  2: i32 numResults = DEFAULT_NUM_RESULTS;
}

// The response from the `search` RPC. Contains the list of results.
struct PeopleSearchResponse {
  1: list<search_types.PersonMetadata> results;
}

/*
 * The service definition. This defines a single RPC `search`.
 */
service PeopleSearch {
  PeopleSearchResponse search(1: PeopleSearchRequest request);
}
```

### Include Directives

```grammar
include_directive ::=
  ("include" | "cpp_include" | "hs_include") string_literal [";"]
```

Include directives allow the use of constants, types, and services from other Thrift files by prefixing the name (without extension) of the included Thrift file followed by a period. `cpp_include` instructs the compiler to emit an include in generated C++ code and `hs_include` does the same for Haskell.

```thrift
include "common/if/search_types.thrift"

struct PeopleSearchRequest {
  // The type `Query` is defined in search_types.thrift.
  1: search_types.Query query;
  2: i32 numResults = DEFAULT_NUM_RESULTS;
}

struct PeopleSearchResponse {
  // The type `PersonMetadata` is defined in search_types.thrift.
  1: list<search_types.PersonMetadata> results;
}
```

If there is a circular dependency between files, a compile-time error is reported. So `a.thrift` cannot include itself, and cannot include `b.thrift` if `b.thrift` includes `a.thrift`. Including multiple files with a common ancestor is okay, so `a.thrift` can include `b.thrift` and `c.thrift` when both `b.thrift` and `c.thrift` include `d.thrift`.

Including a file only provides access to symbols defined directly in that file; if `a.thrift` only includes `b.thrift` and `b.thrift` includes `d.thrift` then `a.thrift` can use symbols starting with `b.` but not `d.`. Referencing the latter without the corresponding include directive is an error in some target languages and is deprecated behavior in Thrift.

### Package Declaration

A package declaration determines the default namespaces for target languages, e.g. the namespace for the generated C++ code and the package for Java. It is also used for applying file-level annotations and as the default [universal name](../features/universal-name.md) prefix.
```grammar
package_declaration ::=  [annotations] "package" package_name [";"]
```

A package name is a string containing a domain name and a path separated with `/`, for example:

```thrift
package "meta.com/search"
```

The lexical structure of a package name is defined below.

```grammar
package_name  ::=  '"' domain_path '"' | "'" domain_path "'"
domain_path   ::=  domain "/" path
domain        ::=  domain_prefix "." identifier
domain_prefix ::=  identifier ("." identifier)*
path          ::=  identifier ("/" identifier)*
```

Let `namespace_path` denote `path` where every `/` is replaced with `.` and `reverse(d)` denote the domain `d` with components in reverse order. Then the default namespaces are derived from the package name in the following way.

* C++ (`cpp2`): The namespace is a concatenation of `reverse(domain_prefix)` and `namespace_path` separated by a period (`.`).
* Python (`python`, `py3`): The namespace is a concatenation of `reverse(domain_prefix)` and `namespace_path` with the last path component (and preceding `.`) removed if it is equal to the Thrift file name without extension. The domain prefix and the path are separated by a period.
* Hack: The namespace is the value of `namespace_path`.
* Java (`java.swift`): The namespace is a concatenation of `reverse(domain)` and `namespace_path` separated by a period.

Other Thrift implementations must define their own mappings from package names to namespaces.
Here is an example with the package name containing the Thrift file without extension (`query`):

```thrift title="search/query.thrift"
package "meta.com/search/query"
```

This gives the following default namespaces:

| Language    | Target(s)       | Namespace               |
| ----------- | --------------- | ----------------------- |
| C++         | `cpp2`          | `meta.search.query`     |
| Python      | `python`, `py3` | `meta.search`           |
| Hack        | `hack`          | `meta.search.query`     |
| Java        | `java.swift`    | `com.meta.search.query` |

Here is an example with the package name not containing the file name component:

```thrift title="search/query.thrift"
package "meta.com/search"
```

| Language    | Target(s)       | Namespace         |
| ----------- | --------------- | ----------------- |
| C++         | `cpp2`          | `meta.search`     |
| Python      | `python`, `py3` | `meta.search`     |
| Hack        | `hack`          | `meta.search`     |
| Java        | `java.swift`    | `com.meta.search` |

In both cases the Python namespaces don't include the file name component.

The default namespace can be overridden for individual target languages with the [namespace directive](#namespace-directives).

Annotations on a package declaration apply to the whole file. For example:

```thrift
@cpp.TerseWrite
package "facebook.com/peoplesearch"

struct PeopleSearchRequest {
  1: search_types.Query query;
  2: i32 numResults = DEFAULT_NUM_RESULTS;
}
```

This enables terse writes for `query` and `numResults` fields even though they are not annotated with `@cpp.TerseWrite` themselves.

The package name is also used as a prefix for universal names. For example:

```thrift
package "facebook.com/peoplesearch"

// Has the universal name "facebook.com/peoplesearch/PeopleSearchRequest".
struct PeopleSearchRequest { /* ... */ }
```

### Namespace Directives

```grammar
namespace_directive ::=  "namespace" maybe_qualified_id namespace_name [";"]
namespace_name      ::=  maybe_qualified_id | string_literal
```

Namespace directives override the default namespaces set by a [package declaration](#package-declaration).
Namespaces control the top-level structure of the generated code in a language-specific way. They do not affect the Thrift file semantics.

```thrift
// Directs the compiler to generate C++ code inside the namespace
// `facebook::peoplesearch`.
namespace cpp2 facebook.peoplesearch

// Directs the compiler to generate Java code inside the package
// `com.facebook.peoplesearch`.
namespace java.swift com.facebook.peoplesearch
```

## Definitions

Thrift supports three kinds of definitions: type definitions, interface definitions and constant definitions.

```grammar
definition ::=
  type_definition | interface_definition | constant_definition

type_definition ::=  struct | union | exception | enum | typedef

interface_definition ::=  service | interaction
```

Definitions appear at the top level of a Thrift file. A common property of all definitions is that they introduce a name (an identifier) that can be used to denote an entity such as a type. Definition names must be unique within a file.

The name introduced by definitions can be used anywhere in the Thrift file (either before or after the location of the definition), and in other Thrift files that include this Thrift file.

### Structs

A struct definition introduces a named struct type into your program and has the following form:

```grammar
struct ::=
  [annotations]
  "struct" identifier "{"
    field*
  "}"

field ::=
  [annotations]
  field_id ":" [field_qualifier] type identifier [default_value] [";"]

field_id ::=  integer
```

*The struct type is the centerpiece of the Thrift language. It is the basic unit of serialization and versioning in the language.*

This section defines the primary aspects of structs. Separate sections define [qualifiers](#field-qualifiers) and [default values](#default-values) for fields, constraints on serialization of structs, and compatibility between different versions of struct types.

Structs in Thrift look similar to structs in C++ though fields in Thrift have a field id and optionally a qualifier.
They are significantly different in that the order of appearance of fields is not important and there are no expectations on the memory layout of the objects in the corresponding programming language code.

```thrift
struct PersonMetadata {
  1: string name;
  2: i32 age;
  3: list<i64> friendIds;
}
```

Every field has a type and a name (an identifier) that can be used to denote the field. In addition every field also has an integer id that can be used to denote the field. Field ids and field names must be unique within a struct.

While the Thrift language does not impose restrictions on how the id and name are used, the *primary usage* has the id denoting the field in the serialized data and the name denoting the field in the generated code. This primary usage is key to the value provided by Thrift in terms of versioning support and compact serialized data size.

The *primary usage* described above can be assumed for the large majority of Thrift usage. The exception to this is that Thrift does offer a JSON serialization strategy where the name is used to denote the field in the serialized data and the id is not used. This strategy is used to define config schemas (e.g. in [Configerator](https://sigops.org/s/conferences/sosp/2015/current/2015-Monterey/printable/008-tang.pdf)) and should not be mixed with other usage where the id is used instead.

The field ids should be in the range of 1 through 32767, and they do not have to appear in order. They normally start with 1 although this is not required.

Thrift does not support structure inheritance but it is possible to use composition to achieve a similar goal.

As a consequence of the above rules, `PersonMetadata1` below defines a type equivalent to `PersonMetadata` above:

```thrift
// Just a different ordering of fields.
struct PersonMetadata1 {
  2: i32 age;
  1: string name;
  3: list<i64> friendIds;
}
```

However, the following two types are both legal but different from `PersonMetadata`:

```thrift
// Some ids are different.
struct PersonMetadata2 {
  590: string name;
  36: i32 age;
  3: list<i64> friendIds;
}

// Some names are different.
struct PersonMetadata3 {
  1: string firstName;
  2: i32 age;
  3: list<i64> friends;
}
```

:::caution
Do not reuse ids. If a field is removed, it's a good practice to comment it out as a reminder not to reuse it.
:::

### Unions

A union definition introduces a named union type into your program and has the following form:

```grammar
union ::=
  [annotations]
  "union" identifier "{"
    field*
  "}"
```

Unions are identical to structs in all ways, except for the following differences:

* Unions use the reserved word `union` instead of `struct`.
* All fields must be unqualified, but they are equivalent to [optional struct fields](#optional-fields).
* At most one field can be *present*.

The concepts "optional" and "present" are described in [Optional Fields](#optional-fields).

The serialized representation of unions is identical to that of structs that have a single field present. When deserializing into a union, the serialized data must not provide a value for more than one of the fields of the union. In other words, it is possible to serialize from a union and deserialize into a compatible struct, and vice versa.

The generated code for unions can be different from that of structs - for example, implementations may choose to use programming language unions in generated code for better memory efficiency.

:::note
It is possible for none of the fields to be present in a union.
:::

### Exceptions

An exception definition introduces a named exception type into your program and has the following form:

```grammar
exception ::=
  [annotations]
  [error_safety] [error_kind] [error_blame]
  "exception" identifier "{"
    field*
  "}"

error_safety ::=  "safe"
error_kind   ::=  "transient" | "stateful" | "permanent"
error_blame  ::=  "client" | "server"
```

Exceptions are identical to structs in all ways, except that these types should only be used as types within the `throws` clause of functions.
This is not enforced at the moment.

The serialized representation of exceptions is identical to that of structs. It is possible to serialize from an exception and deserialize into a compatible struct, and vice versa.

The generated code for exceptions can be different from that of structs. For example, programming language exceptions can be used in generated code.

Exceptions support the following qualifiers:

* Fault attribution: `client` or `server`
* Error classification: `transient`, `stateful` or `permanent`
* Error safety: `safe` (vs. unspecified)

For information on how to use them, see [Errors and Exceptions](/features/exception.md).

### Enums

An enum definition introduces a named enumeration type into your program and has the following form:

```grammar
enum ::=
  [annotations]
  "enum" identifier "{"
    (enumerator [","])*
  "}"

enumerator ::=  identifier "=" integer
```

The enumerators (the named constants) must be explicitly bound to an integer value. The identifier after the reserved word `enum` may be used to denote the enumeration type.

```thrift
enum SearchKind {
  UNKNOWN = 0, // default value
  PEOPLE = 1,
  PAGES = 3,
  GROUPS = 5,
}
```

:::note
Because the default value for every enum is 0 (even if you do not define an enumerator for 0), it is recommended to include an `UNKNOWN` entry with the value 0 and use it to indicate that the client or server didn't provide the value. This way missing values will be represented as `UNKNOWN` instead of something meaningful that happens to be defined to zero in the IDL.
:::

:::caution
Enums are treated like integers by Thrift. If you send a value that is not listed in the IDL, the receiver may not check or convert the value to the default (it will just have an out-of-range value), or it may throw an exception on access. This can happen when a new client is talking to an old server.
:::

:::caution
Removing and adding enum values can be dangerous - see [Schema Compatibility](/features/compatibility.md).
:::

### Typedefs

A typedef introduces a named alias of a type and has the following form:

```grammar
typedef ::=  [annotations] "typedef" type identifier [";"]
```

It can be used to provide a simpler way to access complex types, for example:

```thrift
typedef map<string, string> StringMap
```

### Services

A service definition introduces a named service into your program and has the following form:

```grammar
service ::=
  [annotations]
  "service" identifier ["extends" base_service_name] "{"
    (function | performs)*
  "}"

function ::=
  [annotations]
  [function_qualifier]
  return_clause identifier "(" (parameter [","])* ")" [throws] [";"]

base_service_name ::=  maybe_qualified_id

function_qualifier ::=  "oneway" | "idempotent" | "readonly"

return_clause ::=
    return_type ["," (sink | stream)]
  | interaction_name ["," return_type] ["," (sink | stream)]

return_type ::=  type | "void"

initial_response_type ::=  type

sink   ::=  "sink" "<" type [throws] "," type [throws] ">"
stream ::=  "stream" "<" type [throws] ">"
throws ::=  "throws" "(" (parameter [","])* ")"

parameter ::=  [annotations] field_id ":" type identifier [default_value]

performs ::=  "performs" interaction_name ";"

interaction_name ::=  maybe_qualified_id
```

An interface for RPC is defined in a Thrift file as a service. Example:

```thrift
struct SearchRequest {
  1: string query;
  2: i32 numResults;
}

struct SearchResponse {
  1: list<string> results;
}

safe exception SearchException {
  1: string errorMessage;
  2: i64 errorCode;
}

service Search {
  SearchResponse search(1: SearchRequest request)
    throws (1: SearchException e);
}
```

A **service** is an interface for RPC defined in Thrift. Each service has a set of functions. Each function has a name, which must be unique within the service, and takes a list of parameters. It can return normally with a result if the result type is not `void` or it can throw one of the listed application exceptions.
In addition, the function can throw a Thrift system exception if there was some underlying problem with the RPC itself.

The parameters in the `throws` clause must have exception types. If a function throws one of the exceptions given in this clause, then all of the members of this exception will be serialized and sent over the wire. For other undeclared exceptions only the message will be serialized and they will appear on the client side as `TApplicationException`.

Function parameters are similar to struct fields except that they don't take field qualifiers, meaning that **parameters cannot be optional**. The proper way to achieve this is to use a struct type parameter, which itself then may contain an `optional` field.

Functions support the following **function qualifiers**:

- `oneway`: the client does not expect a response back from the server,
- `idempotent`: safe to retry immediately after a transient failure,
- `readonly`: always safe to retry.

See [Errors and Exceptions](/features/exception.md) for more information on how to use the `readonly` and `idempotent` qualifiers for automatic retries.

Functions that use the `oneway` qualifier (oneway functions) are "fire and forget". It means that the client sends the function parameters to the server, but does not wait or expect a result. Therefore oneway functions must use `void` as the return type and must not have a `throws` clause.

```thrift
service Logger {
  // Log an informational message without getting a confirmation of success.
  oneway void logInfo(1: string message);
}
```

:::caution
Oneway methods may be silently dropped and are therefore discouraged for most use cases.
:::

Services may extend (inherit from) other services. The set of functions in the inherited (base) service is included in the inheriting service. The name of the base service is given as a, possibly qualified, identifier after the reserved word `extends`.
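As a minimal sketch of service inheritance, the example below defines a base service and a service that extends it. All names (`Status`, `Monitored`, `Echo`) are hypothetical and used only for illustration.

```thrift
// Hypothetical definitions; all names below are illustrative.
struct Status {
  1: bool healthy;
  2: string details;
}

// Base service with a single read-only function.
service Monitored {
  readonly Status getStatus();
}

// Inherits `getStatus` from `Monitored` and adds `echo`, so clients of
// `Echo` can call both functions.
service Echo extends Monitored {
  idempotent string echo(1: string message);
}
```

If the base service lived in another file, say a hypothetical `monitoring.thrift`, that file would need to be included and the base would be written in qualified form as `monitoring.Monitored`.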
Service names are visible after the end of the service definition in which they have been introduced, and in other Thrift files that include this Thrift file.

:::caution
New parameters could be added to a method, but it is better to define an input struct and add members to it instead. The server cannot distinguish between missing parameters and default values, so a request object is better.
:::

#### Streaming

A function containing `stream` in its declaration establishes a server-to-client stream when called. A **stream** is a communication abstraction between a client and server, where a server acts as the producer and the client acts as the consumer. It allows the controlled flow of ordered messages from the server to the client. All messages in the stream have the same payload object type `T`, also known as the stream element type. The function may also return an initial response specified in the IDL. The client can choose to cancel the stream at any time. The server can terminate the stream by sending an exception.

A function containing `sink` in its declaration establishes a client-to-server stream. A **sink** is similar to a stream, but the client acts as the producer and the server acts as the consumer. It allows the flow of ordered messages of type `T` from the client to the server. It may return an initial response specified in the IDL, and it always returns a final response of type `U`. The client will wait for a final response back from the server marking the completion of the sink. The client can terminate the sink by sending an exception to the server. The server can also terminate the sink by sending an exception while consuming payloads. The exception acts as the termination of the sink.

Both `sink` and `stream` may be preceded by a return type which specifies the initial response type. The initial response type cannot be `void` but can be omitted. Each response type can optionally have a list of declared exceptions associated with it.
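To make the sink form concrete, here is a minimal sketch of a function with an initial response, a sink payload type `T` (`binary` here) and a final response type `U` that declares an exception. All type, service and function names are hypothetical and used only for illustration.

```thrift
// Hypothetical types and service; the names are illustrative only.
struct UploadRequest {
  1: string fileName;
}

struct UploadStarted {
  1: i64 uploadId;
}

struct UploadSummary {
  1: i64 bytesReceived;
}

exception UploadError {
  1: string reason;
}

service FileUploader {
  // Initial response: UploadStarted.
  // Sink payloads (client to server): binary chunks.
  // Final response (server to client): UploadSummary, or the declared
  // UploadError if the server rejects the upload.
  UploadStarted, sink<binary, UploadSummary throws (1: UploadError e)> upload(
    1: UploadRequest request,
  );
}
```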
Example of a function that returns an initial response and establishes a stream:

```thrift
struct GetFileRequest {
  1: string fileName;
  2: i64 chunkSize;
}

struct GetFileResponse {
  1: i64 fileSize;
}

struct FileChunk {
  1: binary data;
}

service FileServer {
  // Returns a GetFileResponse object and establishes
  // a server-to-client stream of FileChunk objects.
  GetFileResponse, stream<FileChunk> getFile(1: GetFileRequest request);
}
```

Refer to [Thrift Streaming](/fb/features/streaming/index.md) for more information.

### Interactions

An interaction definition introduces a named interaction into your program and has the following form:

```grammar
interaction ::=
  [annotations]
  "interaction" identifier "{"
    function*
  "}"
```

An **interaction** is a multi-request contract that keeps related methods together, manages state, and integrates with the routing layer to ensure those requests are sent over the same connection to the same host. It can be created using a **factory function** that returns an interaction, or using a **constructor** declared with the reserved word `performs`, which generates a non-RPC constructor on the client and server. The server can also handle a **termination signal** which, unlike the destructor, is invoked immediately instead of waiting for outstanding requests and streams. Using annotations, an interaction can be made serial, which limits processing to a single method at a time, or can use the event base threading model, which processes requests directly on the I/O thread instead of the server executor.

Refer to [Interactions](/fb/features/interactions.md) for more information.
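The two ways of creating an interaction can be sketched as follows. The interaction, services and functions are hypothetical and used only for illustration. The two forms are shown in separate services on the assumption that a service would normally pick one of them.

```thrift
// Hypothetical interaction; names are illustrative only.
interaction Counter {
  void increment(1: i32 delta);
  i32 current();
}

// Constructor form: `performs` generates a non-RPC constructor for
// `Counter` on the client and server.
service CounterService {
  performs Counter;
}

// Factory function form: calling `newCounter` is an RPC that returns
// a new `Counter` interaction.
service CounterFactory {
  Counter newCounter(1: i32 initialValue);
}
```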
### Constants

A constant definition introduces a named constant into your program and has the following form:

```grammar
constant_definition ::=
  [annotations]
  "const" type identifier "=" initializer [";"]

initializer ::=
    integer
  | float
  | string_literal
  | bool_literal
  | maybe_qualified_id
  | list_initializer
  | map_initializer

integer ::=  ["+" | "-"] int_literal
float   ::=  ["+" | "-"] float_literal
```

The constant name can be used instead of the value after the completion of the constant definition and in other Thrift files that include this Thrift file.

A constant can be initialized with one of the following:

* A literal, with numeric literals optionally preceded with a sign
* A name of another constant
* A list or map initializer

Examples of constants initialized with literals:

```thrift
const bool FLAG = true;
const byte OFFSET = -10; // byte is an 8-bit signed integer
const i16 COUNT = 200;
const i32 MASK = 0xFA12EE;
const double E = 2.718281828459;
const string DATE = "June 28, 2017";
```

Most initializers don't have exact types on their own. Instead their types are inferred from context. For example, `[1, 2, 3]` can be used to initialize a constant of type `list<i32>`, `list<i16>` or other compatible type:

```thrift
const list<i32> BIG = [1, 2, 3];
const list<i16> LITTLE = [1, 2, 3];
```

Initializers can have the following inferred types:

| Initializer         | Inferred Type                          |
| ------------------- | -------------------------------------- |
| `integer`           | an integer type, a floating-point type |
| `float`             | a floating-point type                  |
| `string_literal`    | `string`, `binary`                     |
| `bool_literal`      | `bool`                                 |
| `map_initializer`   | a map, struct or union type            |

The value of the initializer must be representable by the inferred type. For example:

```thrift
const i16 LOWER = 10000;  // OK
const i16 UPPER = 100000; // error: 100000 is out of range for i16
```

:::caution
For legacy reasons some implementations support additional kinds of inference not listed in the table above. Using them is discouraged.
:::

`maybe_qualified_id` represents a name referring to a constant and can be one of:

* An identifier that denotes a constant defined in the same Thrift file.
* A qualified name of the form `<filename>.<constname>` where `<filename>` and `<constname>` are identifiers, and `<constname>` denotes a constant defined in the Thrift file denoted by `<filename>`.
* A qualified name of the form `<enumname>.<enumerator>` where `<enumname>` and `<enumerator>` are identifiers denoting an enum and an enumerator defined in it respectively.
* A qualified name of the form `<filename>.<enumname>.<enumerator>` where `<filename>`, `<enumname>`, and `<enumerator>` are identifiers. `<enumname>` denotes an enum defined in the Thrift file denoted by `<filename>` and `<enumerator>` is the enumerator defined in this enum.

```thrift
const i32 SEARCH_AGGREGATOR_PORT = PORT;
const double EULERS_NUMBER = MathConstants.E;
const SearchKind SEARCH_KIND = SearchKind.PAGES;
const search_types.SearchKind SK = search_types.SearchKind.GROUPS;
```

#### List and Map Initializers

```grammar
list_initializer ::=  "[" [(initializer ",")* initializer [","]] "]"
map_initializer  ::=  "{" [(map_entry ",")* map_entry [","]] "}"
map_entry        ::=  initializer ":" initializer
```

Constants can also be of list, set, or map types.

```thrift
const list<i32> AList = [2, 3, 5, 7]

const set<string> ASet = ["foo", "bar", "baz"]

const map<string, list<i32>> AMap = {
  "foo" : [1, 2, 3, 4],
  "bar" : [10, 32, 54],
}
```

## Types

```grammar
type ::=  primitive_type | container_type | maybe_qualified_id
```

Thrift supports primitive, container and named types. The name can be an identifier that denotes a type defined in the same Thrift file, or a qualified name of the form `filename.typename` where `filename` and `typename` are identifiers, and `typename` denotes a type defined in the Thrift file denoted by `filename`.
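The three kinds of types can be seen side by side in the sketch below. The struct names are hypothetical, and the example assumes that `search_types.thrift` defines a type named `Query`.

```thrift
// Hypothetical file; assumes search_types.thrift defines `Query`.
include "common/if/search_types.thrift"

struct Bookmark {
  1: i64 id;                           // primitive type
  2: map<string, list<i32>> counters;  // container type with a nested container
  3: Owner owner;                      // named type defined later in this file
  4: search_types.Query query;         // named type from an included file
}

struct Owner {
  1: string name;
}
```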
### Primitive Types

```grammar
primitive_type ::= "bool" | "byte" | "i16" | "i32" | "i64" |
                   "float" | "double" | "string" | "binary"
```

* `bool`: `true` or `false`
* `byte`: an 8-bit signed integer
* `i16`: a 16-bit signed integer
* `i32`: a 32-bit signed integer
* `i64`: a 64-bit signed integer
* `float`: a 32-bit floating-point number
* `double`: a 64-bit floating-point number
* `string`: a UTF-8 string
* `binary`: a byte array

Integer types consist of `byte`, `i16`, `i32` and `i64`.

Floating-point types consist of `float` and `double`.

Thrift does not support unsigned integers because they have no direct translation to native types in some of Thrift’s target languages such as Hack and Java.

`binary` and `string` are encoded identically in the Binary and Compact protocols used for RPC and are interchangeable. However, they are encoded differently in JSON protocols: `binary` is Base64-encoded while `string` only has special characters escaped.

:::caution
Some target languages enforce that `string` values are UTF-8 encoded and others do not. For example, Java and Python require valid UTF-8, while C++ does not. This can manifest itself as a cross-language incompatibility.
:::

### Container Types

```grammar
container_type ::= list_type | set_type | map_type

list_type ::= "list" "<" type ">"
set_type  ::= "set" "<" type ">"
map_type  ::= "map" "<" type "," type ">"
```

Thrift has strongly-typed containers that map to commonly used containers in target programming languages. There are three container types available:

* `list<T>`: A list of elements of type `T`. May contain duplicates.
* `set<T>`: An unordered set of unique elements of type `T`.
* `map<K, V>`: An unordered map of unique keys of type `K` to values of type `V`.

:::note
In some languages, the default mode is to use ordered sets and maps. This can be changed to use unordered and customized containers (see [Thrift Annotations](/idl/annotations.md#unstructured-annotations-deprecated)).
:::

:::caution
Although not enforced, it is strongly encouraged to use `set` and `map` only when the key is either a string or an integer type, for the highest compatibility between languages.
:::

The element, key, and value types can be any Thrift type, including nested containers.

## Default Values

```grammar
default_value ::= "=" initializer
```

Every Thrift type has a _standard default value_:

* bool: `false`
* Integer types: 0
* enum: 0 (even if that enum has no enumerator with value zero)
* float and double: +0.0
* string and binary: empty string
* containers: empty container
* structs:
  * each optional field is not present
  * each non-optional field is set to its custom default if an initializer was provided, or (recursively) to the standard default value for its type
* unions: empty union

An instance of a Thrift struct that is initialized without an explicit value should be initialized to its standard default value.

:::note
When initializing or deserializing a struct in Java, or in Hack without the `nonnullables` compiler option, unqualified fields that are not present or not explicitly initialized will be set to null/None instead of their standard default values.
:::

:::note
The concepts "optional", "unqualified", and "not present" are described in more detail below.
:::

```thrift
enum Foo { A = 1, B = 2, C = 3}

struct Person {
  1: i64 age;
  2: string name;
}

struct Bar {
  1: i64 field1 = 10; // default value is 10
  2: i64 field2; // default value is 0
  3: map<i32, string> field3 = {15 : 'a_value', 2: 'b_value'}; // default value is the map with two entries
  4: list<Foo> field4 = [Foo.A, Foo.B, Foo.A]; // default value is a list with three entries
  5: Person field5 = Person{age = 40, name = "John"}; // default value is a struct with these default values
  6: Foo field6; // default value is 0 (unnamed)
}
```

### Intrinsic Default Value

Every Thrift type also has an _intrinsic default value_.
The intrinsic default value of a type is the same as its standard default value except for structs, where custom field initializers are ignored.

For example, the intrinsic default value of `struct Bar` above is:

```
Bar {
  field1 = 0, // vs. standard default value: 10
  field2 = 0, // same as standard default value
  field3 = {}, // vs. {15 : 'a_value', 2: 'b_value'}
  field4 = [], // vs. [Foo.A, Foo.B, Foo.A]
  field5 = Person{age = 0, name = ""}, // vs. Person{age = 40, name = "John"}
  field6 = 0, // same
}
```

:::note
In practice, users are less likely to interact with intrinsic default values than with standard ones, but they are nonetheless relevant in specific use cases such as [Terse Fields](./#terse-fields). In C++, the `apache::thrift::clear()` function sets objects to their intrinsic default values.
:::

:::caution
Avoid using default values on optional fields. It is not possible to tell if the server sent the value, or if it is just the default value.
:::

:::caution
Do not change default values after setting them. It is hard to know what code may be relying on the previous default values.
:::

## Field Qualifiers

```grammar
field_qualifier ::= "required" | "optional"
```

The fields of a Thrift object may assume any value of their types. In addition, a field may also be "uninitialized" (formally "not present"). That is, the *state* of a field is either *not present*, or *present* with a particular value of its type.

There are four kinds of fields that may be part of a struct type (or exception type):

* ~~required: Field is qualified with the reserved word `required`.~~ (deprecated)
* *optional*: Field is qualified with the reserved word `optional`.
* *terse*: Field is annotated with the structured annotation `@thrift.TerseWrite`.
* *unqualified*: Field is not qualified with either of these reserved words or `@thrift.TerseWrite`.

:::note
Union types may only have optional fields.
:::

```thrift
include "thrift/annotation/thrift.thrift"

struct PeopleSearchRequest {
  1: required search_types.Query query; // required field
  2: i32 numResults; // unqualified field
  3: optional Location currentLocation; // optional field
  @thrift.TerseWrite
  4: i32 count; // terse field
}
```

These field kinds differ in the states they may assume, and in their serialization and deserialization behavior.

### ~~Required Fields~~

:::caution
Do not use in new code!
:::

~~Required fields are always _present_ and they get initialized to their default value when the object is initialized (or reinitialized). Generated code must provide a way to change the value of required fields. When serializing an object, required fields are always encoded into the serialized data. The serialized data must provide a value during deserialization into a required field.~~

The presence of required fields is no longer enforced (Java still enforces it, but that is being removed too).

### Optional Fields

Optional fields start their lifecycle as *not present* (when the object is initialized or reinitialized). Generated code must provide a way to introduce the field (change state from *not present* to *present*) and to reinitialize the field (change state from *present* to *not present*). Generated code must also provide a way to change the value of optional fields. When serializing an object, optional fields are encoded into the serialized data if and only if they are *present*. During deserialization, the serialized data may or may not provide a value for an optional field. If a value is provided, the optional field becomes *present* with that value; otherwise the optional field does not change its state.

### Unqualified Fields

Unqualified fields are just like optional fields, with the following difference: when an object with an unqualified field is serialized and the field is *not present*, the default value of the field is serialized as the value for that field.
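
To make the contrast between optional and unqualified fields concrete, here is a minimal sketch. The struct `Profile` and its field names are hypothetical, and the comments restate the serialization rules described above rather than the behavior of any particular generated code.

```thrift
// Hypothetical struct, for illustration only.
struct Profile {
  // Optional: starts as not present and is written to the
  // serialized data only once it is present.
  1: optional string nickname;

  // Unqualified, no custom default: if not present when the object
  // is serialized, the default value 0 is written for it.
  2: i32 loginCount;

  // Unqualified with a custom default: if not present when the object
  // is serialized, the default value 25 is written for it.
  3: i32 pageSize = 25;
}
```

A reader of the serialized data can therefore tell whether `nickname` was provided, but cannot tell whether `loginCount` or `pageSize` was explicitly set or merely defaulted.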
:::note
There is no general, strong recommendation on whether to use unqualified or optional fields; it depends on the individual use case. If you need to know whether a field was explicitly set, use an optional field.
:::

### Terse Fields

Terse fields are skipped during serialization if their values are equal to the [_intrinsic default values_](./#intrinsic-default-value) for their types. There is no difference between a terse field set to its intrinsic default and one that is not set at all.
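
As a rough illustration of the rule above, the sketch below marks three fields as terse. The struct `PageStats` and its field names are hypothetical; the comments simply apply the intrinsic default values listed earlier (0 for integers, the empty string, and the empty container).

```thrift
include "thrift/annotation/thrift.thrift"

// Hypothetical struct, for illustration only.
struct PageStats {
  @thrift.TerseWrite
  1: i32 hits;           // skipped when hits is 0

  @thrift.TerseWrite
  2: string title;       // skipped when title is the empty string

  @thrift.TerseWrite
  3: list<string> tags;  // skipped when tags is empty
}
```

Under these rules, a `PageStats` whose fields all hold their intrinsic defaults would be serialized without any of the three fields, which is why a reader cannot distinguish "set to 0" from "never set" for `hits`.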