In Python, a _symbol_ is any name that is not a keyword. Symbols can represent classes, functions, methods, variables, parameters, modules, type aliases, type variables, etc.
Symbols are defined within _scopes_. A scope is associated with a block of code and defines which symbols are visible to that code block. Scopes can be “nested”, allowing code to see symbols within its immediate scope and all “outer” scopes.
The following constructs within Python define a scope:
1. The “builtins” scope is always present and is always the outermost scope. It is pre-populated by the Python interpreter with symbols like “int” and “list”.
2. The module scope (sometimes called the “global” scope) is defined by the current source code file.
3. Each class defines its own scope. Symbols that represent methods, class variables, or instance variables appear within a class scope.
4. Each function and lambda defines its own scope. The function’s parameters are symbols within its scope, as are any variables defined within the function.
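As a rough illustration of the four kinds of scope listed above, the following sketch uses hypothetical names (`limit`, `Counter`, `step`, `advance`) that are not part of any real API:

```python
limit = 10  # module ("global") scope


class Counter:
    step = 2  # class scope

    def advance(self, start):
        # start and total live in the function scope of advance.
        total = start + self.step
        # min is found in the builtins scope, limit in the module scope.
        return min(total, limit)
```

Code inside `advance` can see its own symbols plus those of the enclosing module and builtins scopes; the class variable `step` is reached through `self`.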
A symbol can be declared with an explicit type. The “def” and “class” keywords, for example, declare a symbol as a function or a class. Other symbols in Python can be introduced into a scope with no declared type. Newer versions of Python have introduced syntax for declaring the types of input parameters, return values, and variables.
When a parameter or variable is annotated with a type, the type checker verifies that all values assigned to that parameter or variable conform to that type.
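A minimal sketch of declared types on a parameter, a return value, and a local variable follows. The function `greet` and its names are made up for illustration:

```python
def greet(name: str, times: int = 1) -> str:
    # message is declared as str, so every value assigned to it
    # must be compatible with str.
    message: str = f"Hello, {name}! " * times
    return message.strip()


# A call such as greet(42) would be flagged by a type checker,
# because 42 does not conform to the declared type str.
```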
Some languages require every symbol to be explicitly typed. Python allows a symbol to be bound to different values at runtime, so its type can change over time. A symbol’s type doesn’t need to be declared statically.
When Pyright encounters a symbol with no type declaration, it attempts to _infer_ the type based on the values assigned to it. As we will see below, type inference cannot always determine the correct (intended) type, so type annotations are still required in some cases. Furthermore, type inference can require significant computation, so it is much less efficient than when type annotations are provided.
If a symbol’s type cannot be inferred, Pyright sets its type to “Unknown”, which is a special form of “Any”. The “Unknown” type allows Pyright to optionally warn when types are not declared and cannot be inferred, thus leaving potential “blind spots” in type checking.
The simplest form of type inference is one that involves a single assignment to a symbol. The inferred type comes from the type of the source expression. Examples include:
```python
var6 = [p for p in [1, 2, 3]]   # Inferred type is List[int]
```
### Multi-Assignment Type Inference
When a symbol is assigned values in multiple places within the code, those values may have different types. The inferred type of the variable is the union of all such types.
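The rule above can be sketched with a hypothetical function, `pick_value`, whose local variable receives values of two different types:

```python
def pick_value(condition: bool):
    if condition:
        value = 4      # this assignment contributes int
    else:
        value = "hi"   # this assignment contributes str
    # value has no declared type, so its inferred type is the union
    # of the assigned types: Union[int, str].
    return value
```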
In some cases, an expression’s type is ambiguous. For example, what is the type of the expression `[]`? Is it `List[None]`, `List[int]`, `List[Any]`, `Sequence[Any]`, `Iterable[Any]`? These ambiguities can lead to unintended type violations. Pyright uses several techniques for reducing these ambiguities based on contextual information. In the absence of contextual information, heuristics are used.
One powerful technique Pyright uses to eliminate type inference ambiguities is _bidirectional inference_. This technique makes use of an “expected type”.
As we saw above, the type of the expression `[]` is ambiguous, but if this expression is passed as an argument to a function, and the corresponding parameter is annotated with the type `List[int]`, Pyright can now assume that the type of `[]` in this context must be `List[int]`. Ambiguity eliminated!
This technique is called “bidirectional inference” because type inference for an assignment normally proceeds by first determining the type of the right-hand side (RHS) of the assignment, which then informs the type of the left-hand side (LHS) of the assignment. With bidirectional inference, if the LHS of an assignment has a declared type, it can influence the inferred type of the RHS.
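A short sketch of both situations, an annotated parameter and a declared assignment target, is shown here. The names `total` and `scores` are illustrative only:

```python
from typing import List


def total(values: List[int]) -> int:
    return sum(values)


# The parameter annotation supplies the expected type, so the
# argument [] is treated as List[int] in this call.
empty_total = total([])

# A declared type on the left-hand side plays the same role:
# the [] on the right-hand side is treated as List[float].
scores: List[float] = []
scores.append(1.5)
```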
It is common to initialize a local variable or instance variable to an empty list (`[]`) or empty dictionary (`{}`) on one code path but initialize it to a non-empty list or dictionary on other code paths. In such cases, Pyright will infer the type based on the non-empty list or dictionary and suppress errors about a “partially unknown type”.
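The pattern described above might look like the following sketch, where `load_names` is a hypothetical function:

```python
def load_names(use_defaults: bool):
    if use_defaults:
        names = []           # empty list on this code path
    else:
        names = ["a", "b"]   # non-empty list of str on this code path
    # Per the rule above, the non-empty assignment is expected to
    # drive inference, so names is treated as a list of str.
    return names
```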
As with variable assignments, function return types can be inferred from the `return` statements found within that function. The returned type is assumed to be the union of all types returned from all `return` statements. If a `return` statement is not followed by an expression, it is assumed to return `None`. Likewise, if the function does not end in a `return` statement, and the end of the function is reachable, an implicit `return None` is assumed.
```python
# This function has two explicit return statements and one implicit
# return (at the end). It does not have a declared return type,
# so Pyright infers its return type based on the return expressions.
# In this case, the inferred return type is Union[str, bool, None].
def func1(val: int):
    if val > 3:
        return ""
    elif val < 1:
        return True
```
### NoReturn return type
If there is no code path that returns from a function (e.g. all code paths raise an exception), Pyright infers a return type of `NoReturn`. As an exception to this rule, if the function is decorated with `@abstractmethod`, the return type is not inferred as `NoReturn` even if there is no return. This accommodates a common practice where an abstract method is implemented with a `raise NotImplementedError()` statement.
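Both halves of this rule can be sketched as follows. The names `fail` and `Shape` are hypothetical:

```python
from abc import ABC, abstractmethod


def fail(message: str):
    # Every code path raises, so the inferred return type is NoReturn.
    raise ValueError(message)


class Shape(ABC):
    @abstractmethod
    def area(self):
        # Because of @abstractmethod, NoReturn is not inferred here
        # even though the body only raises.
        raise NotImplementedError()
```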
Pyright can infer the return type for a generator function from the `yield` statements contained within that function.
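As a small sketch, the hypothetical generator below has no return annotation, so its type would have to come from the `yield` statement:

```python
def countdown(start: int):
    # Only int values are yielded, so the inferred return type is
    # expected to be a generator whose yield type is int.
    while start > 0:
        yield start
        start -= 1
```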
### Call-site Return Type Inference
It is common for input parameters to be unannotated, which can make it difficult for Pyright to infer the correct return type for a function.
In cases where all parameters are unannotated, Pyright uses a technique called _call-site return type inference_. It performs type inference using the types of the arguments passed to the function in a call expression. If the unannotated function calls other functions, call-site return type inference can be used recursively. Pyright limits this recursion to a small number for practical performance reasons.
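To illustrate the idea, consider a hypothetical unannotated function, `identity`, called with arguments of different types. The inferred types in the comments are what the technique described above would be expected to produce:

```python
def identity(value):
    # No parameter annotation: inferred in isolation, the return
    # type of this function is unknown.
    return value


# At each call site, the argument type is used to infer the result.
number = identity(1)     # expected to be inferred as int
text = identity("hi")    # expected to be inferred as str
```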
Input parameters for functions and methods typically require type annotations. There are several cases where Pyright may be able to infer a parameter’s type if it is unannotated.
For instance methods, the first parameter (named `self` by convention) is inferred to be type `Self`.
For class methods, the first parameter (named `cls` by convention) is inferred to be type `type[Self]`.
For other unannotated parameters within a method, Pyright looks for a method of the same name implemented in a base class. If the corresponding method in the base class has the same signature (the same number of parameters with the same names), no overloads, and annotated parameter types, the type annotation from this method is “inherited” for the corresponding parameter in the child class method.
When parameter types are inherited from a base class method, the return type is not inherited. Instead, normal return type inference techniques are used.
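The inheritance rule can be sketched with two hypothetical classes, `Base` and `Child`:

```python
class Base:
    def scale(self, value: int, factor: float) -> float:
        return value * factor


class Child(Base):
    # Same method name and same parameter names as Base.scale, with no
    # annotations: value and factor are expected to inherit int and float.
    # The return type is not inherited; it is inferred from the body.
    def scale(self, value, factor):
        return value * factor * 2
```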
If the type of an unannotated parameter cannot be inferred using any of the above techniques and the parameter has a default argument expression associated with it, the parameter type is inferred from the default argument type. If the default argument is `None`, the inferred type is `Unknown | None`.
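A brief sketch of inference from default arguments, using a made-up function `connect`:

```python
def connect(host="localhost", port=8080, timeout=None):
    # host:    inferred from its default value as str
    # port:    inferred from its default value as int
    # timeout: the default is None, so the inferred type is Unknown | None
    return (host, port, timeout)
```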
Python 3.8 introduced support for _literal types_. This allows a type checker like Pyright to track specific literal values of type `str`, `bytes`, `int`, `bool`, and enum. As with other types, literal types can be declared.
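Declared literal types might look like the following sketch. The names `open_mode`, `flag`, and `retries` are illustrative:

```python
from typing import Literal


def open_mode(mode: Literal["r", "w"]) -> str:
    # Only the literal strings "r" and "w" are accepted for mode.
    return "read" if mode == "r" else "write"


flag: Literal[True] = True   # a bool literal type
retries: Literal[3] = 3      # an int literal type
```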
When inferring the type of a tuple expression (in the absence of bidirectional inference hints), Pyright assumes that the tuple has a fixed length, and each tuple element is typed as specifically as possible.
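For example, with the hypothetical variables below, each tuple would be treated as fixed-length with a distinct type per element:

```python
record = (1, "hi", 3.4)   # expected: a 3-element tuple of int, str, float
nested = (record, None)   # expected: a 2-element tuple; first element is record's tuple type
```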
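When inferring the type of a list expression (in the absence of bidirectional inference hints), Pyright uses the following heuristics:
1. If the list is empty (`[]`), assume `List[Unknown]`.
2. If the list contains at least one element and all elements are the same type T, infer the type `List[T]`.
3. If the list contains multiple elements that are of different types, the behavior depends on the `strictListInference` configuration setting. By default this setting is off.
* If `strictListInference` is off, infer `List[Unknown]`.
* Otherwise use the union of all element types and infer `List[Union[(elements)]]`.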
These heuristics can be overridden through the use of bidirectional inference hints (e.g. by providing a declared type for the target of the assignment expression).
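The list heuristics and the override can be sketched as follows, with hypothetical variable names:

```python
from typing import List, Union

same = [1, 2]       # all elements are int, so List[int] is inferred
mixed = [1, "a"]    # result depends on the strictListInference setting

# A declared type on the assignment target overrides the heuristics:
# the list expression is evaluated against the expected type.
declared: List[Union[int, str]] = [1, "a"]
```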
When inferring the type of a set expression (in the absence of bidirectional inference hints), Pyright uses the following heuristics:
1. If the set contains at least one element and all elements are the same type T, infer the type `Set[T]`.
2. If the set contains multiple elements that are of different types, the behavior depends on the `strictSetInference` configuration setting. By default this setting is off.
* If `strictSetInference` is off, infer `Set[Unknown]`.
* Otherwise use the union of all element types and infer `Set[Union[(elements)]]`.
These heuristics can be overridden through the use of bidirectional inference hints (e.g. by providing a declared type for the target of the assignment expression).
```python
var1 = {1, 2}  # Infer Set[int]

# Type depends on strictSetInference config setting
var2 = {1, 3.4}  # Infer Set[Unknown] (off) or Set[Union[int, float]] (on)
```
When inferring the type of a dictionary expression (in the absence of bidirectional inference hints), Pyright uses the following heuristics:
1. If the dict is empty (`{}`), assume `Dict[Unknown, Unknown]`.
2. If the dict contains at least one element and all keys are the same type K and all values are the same type V, infer the type `Dict[K, V]`.
3. If the dict contains multiple elements where the keys or values differ in type, the behavior depends on the `strictDictionaryInference` configuration setting. By default this setting is off.
* If `strictDictionaryInference` is off, infer `Dict[Unknown, Unknown]`.
* Otherwise use the union of all key and value types and infer `Dict[Union[(keys)], Union[(values)]]`.
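The dictionary heuristics above can be sketched with a few hypothetical variables. As with lists and sets, a declared type on the assignment target takes precedence:

```python
from typing import Dict, Union

empty = {}                     # Dict[Unknown, Unknown]
uniform = {"a": 1, "b": 2}     # all keys str, all values int: Dict[str, int]
mixed = {1: "a", "b": 2}       # result depends on strictDictionaryInference

# The declared type supplies the expected key and value types.
declared: Dict[Union[int, str], Union[int, str]] = {1: "a", "b": 2}
```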
Lambdas present a particular challenge for a Python type checker because there is no provision in the Python syntax for annotating the types of a lambda’s input parameters. The types of these parameters must therefore be inferred based on context using bidirectional type inference. Absent this context, a lambda’s input parameters (and often its return type) will be unknown.
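A minimal sketch of a lambda with and without an expected type follows. The helper `apply` is hypothetical:

```python
from typing import Callable


def apply(func: Callable[[int, int], int], a: int, b: int) -> int:
    return func(a, b)


# The Callable annotation on func provides the expected type, so the
# lambda's parameters x and y are treated as int.
result = apply(lambda x, y: x + y, 2, 3)

# With no expected type available, the parameter x (and therefore the
# return type) is unknown to the type checker.
standalone = lambda x: x
```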