Added documentation that covers the major behavioral differences between pyright and mypy and the justifications for those differences.

Eric Traut 2022-12-18 15:11:11 -08:00
parent c4e61dcc6a
commit 5a19cb9f0a
2 changed files with 414 additions and 0 deletions


@@ -113,6 +113,7 @@ To update to the latest version:
* [Settings](/docs/settings.md)
* [Comments](/docs/comments.md)
* [Type Inference](/docs/type-inference.md)
* [Differences from Mypy](/docs/mypy-comparison.md)
* [Import Resolution](/docs/import-resolution.md)
* [Extending Builtins](/docs/builtins.md)
* [Type Stubs](/docs/type-stubs.md)

docs/mypy-comparison.md Normal file

@@ -0,0 +1,413 @@
# Differences Between Pyright and Mypy
## What is Mypy?
Mypy is the “OG” in the world of Python type checkers. It was started by Jukka Lehtosalo in 2012 with contributions from Guido van Rossum, Ivan Levkivskyi, and many others over the years. For a detailed history, refer to [this documentation](http://mypy-lang.org/about.html). The code for mypy can be found in [this GitHub project](https://github.com/python/mypy).
## Why Does Pyright's Behavior Differ from Mypy's?
Mypy served as a reference implementation of [PEP 484](https://www.python.org/dev/peps/pep-0484/), which defines standard behaviors for Python static typing. Although PEP 484 spells out many type checking behaviors, it intentionally leaves many other behaviors undefined. This approach has allowed different type checkers to innovate and differentiate.
Pyright generally adheres to the type checking behaviors spelled out in PEP 484 and follow-on typing PEPs (526, 544, 586, 589, etc.). For behaviors that are not explicitly spelled out in these standards, pyright generally tries to adhere to mypy's behavior unless there is a compelling justification for deviating. This document discusses these differences and provides the reasoning behind each design choice.
## Design Goals
Pyright was designed with performance in mind. It is not unusual for pyright to be 3x to 5x faster than mypy when type checking large code bases. Some of its design decisions were motivated by this goal.
Pyright was also designed to be used as the foundation for a Python [language server](https://microsoft.github.io/language-server-protocol/). Language servers provide interactive programming features such as completion suggestions, function signature help, type information on hover, semantic-aware search, semantic-aware renaming, semantic token coloring, refactoring tools, etc. For a good user experience, these features require highly responsive type evaluation performance during interactive code modification. They also require type evaluation to work on code that is incomplete and contains syntax errors.
To achieve these design goals, pyright is implemented as a “lazy” or “just-in-time” type evaluator. Rather than analyzing all code in a module from top to bottom, it is able to evaluate the type of an arbitrary identifier anywhere within a module. If the type of that identifier depends on the types of other expressions or symbols, pyright recursively evaluates those in turn until it has enough information to determine the type of the requested identifier. By comparison, mypy uses a more traditional multi-pass architecture where semantic analysis is performed multiple times on a module from the top to the bottom until all types converge.
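The on-demand evaluation described above can be illustrated with a toy model. This is a minimal sketch, not pyright's implementation: `LazyEvaluator`, the `declarations` table, and the string-based "types" are hypothetical names invented for illustration. It shows only the core idea that a symbol's type is computed when requested, its dependencies are resolved recursively, and results are cached.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical toy model: each symbol lists the symbols it depends on and a
# function that combines their (string) types into its own type.
Declaration = Tuple[List[str], Callable[..., str]]


class LazyEvaluator:
    def __init__(self, declarations: Dict[str, Declaration]) -> None:
        self._declarations = declarations
        self._cache: Dict[str, str] = {}
        self.evaluated: List[str] = []  # order in which symbols were resolved

    def type_of(self, name: str) -> str:
        # Return the cached result if this symbol was already evaluated.
        if name in self._cache:
            return self._cache[name]
        dependencies, combine = self._declarations[name]
        # Recursively evaluate only the symbols this one depends on.
        result = combine(*(self.type_of(dep) for dep in dependencies))
        self._cache[name] = result
        self.evaluated.append(name)
        return result


declarations: Dict[str, Declaration] = {
    "a": ([], lambda: "int"),
    "b": (["a"], lambda a: f"list[{a}]"),
    "c": ([], lambda: "str"),  # never requested below, so never evaluated
}

evaluator = LazyEvaluator(declarations)
requested = evaluator.type_of("b")
```

Requesting `b` resolves `a` first and leaves `c` untouched, which is why this style suits interactive queries such as hover. A real evaluator also has to detect circular dependencies, which this sketch ignores.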
Pyright implements its own parser, which recovers gracefully from syntax errors and continues parsing the remainder of the source file. By comparison, mypy uses the parser built in to the Python interpreter, and it does not support recovery after a syntax error.
## Type Checking Unannotated Code
By default, pyright performs type checking for all code regardless of whether it contains type annotations. This is important for language server features. It is also important for catching bugs in code that is unannotated.
By default, mypy skips all functions or methods that do not have type annotations. This is a common source of confusion for mypy users who are surprised when type violations in unannotated functions go unreported. If the option `--check-untyped-defs` is enabled, mypy performs type checking for all functions and methods.
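As a small illustration of why this matters, the hypothetical function below has no annotations and contains a type violation. Based on the behavior described above, mypy would skip the body unless `--check-untyped-defs` is enabled, while pyright would still analyze it. The function name and values are invented for this sketch.

```python
def add_suffix(values):
    # No annotations: per the text above, mypy skips this body by default,
    # while pyright still checks it.
    count = len(values)
    return count + "1"  # int + str: a checker that analyzes this body can flag it


try:
    add_suffix([1, 2, 3])
    failed = False
except TypeError:
    failed = True  # the unreported type violation surfaces at runtime instead
```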
## Inferred Return Types
If a function or method lacks a return type annotation, pyright infers the return type from `return` and `yield` statements within the function's body (including the implied `return None` at the end of the function body). This is important for supporting completion suggestions. It also improves type checking coverage and eliminates the need for developers to supply return type annotations for trivial return types.
By comparison, mypy never infers return types and assumes that functions without a return type annotation have a return type of `Any`. This was an intentional design decision by mypy developers and is explained in [this thread](https://github.com/python/mypy/issues/10149).
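A minimal sketch of the difference, using a hypothetical `parse_port` function that has no return annotation. Following the rules described above, pyright should infer `int | None` from the two `return` statements, whereas mypy would treat the result as `Any`.

```python
def parse_port(text: str):
    # No return annotation. Per the text above, pyright infers the return type
    # from both return statements, while mypy treats the result as `Any`.
    if not text.isdigit():
        return None
    return int(text)


port = parse_port("8080")
# With an inferred `int | None`, a checker can require this None check before
# the arithmetic; with `Any`, an unguarded `port + 1` would pass unchecked.
next_port = port + 1 if port is not None else None
```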
## Unions vs Joins
When merging two types during code flow analysis or widening types during constraint solving, pyright always uses a union operation. Mypy typically (but not always) uses a “join” operation, which merges types by finding a common supertype. The use of joins discards valuable type information and leads to many false positive errors that are [well documented within the mypy issue tracker](https://github.com/python/mypy/issues?q=is%3Aissue+is%3Aopen+label%3Atopic-join-v-union).
```python
def func1(val: object):
    if isinstance(val, str):
        pass
    elif isinstance(val, int):
        pass
    else:
        return
    reveal_type(val) # mypy: object, pyright: str | int

def func2(condition: bool, val1: str, val2: int):
    x = val1 if condition else val2
    reveal_type(x) # mypy: object, pyright: str | int

    y = val1 or val2
    # In this case, mypy uses a union instead of a join
    reveal_type(y) # mypy: str | int, pyright: str | int
```
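To make the idea of a join concrete, here is a deliberately simplified runtime sketch. The helper `simple_join` is hypothetical and is not how mypy computes joins (real joins also account for generics, protocols, and more); it only walks the method resolution order to find the first shared base class, which is enough to show why joining `str` and `int` loses information.

```python
from typing import Union


def simple_join(left: type, right: type) -> type:
    # Hypothetical simplification: return the first class in left's MRO
    # that is also a base class of right.
    for candidate in left.__mro__:
        if issubclass(right, candidate):
            return candidate
    return object


joined = simple_join(str, int)  # the only shared base is `object`
merged = Union[str, int]        # a union keeps both member types
```

The join result no longer records that the value is a `str` or an `int`, so later attribute accesses valid for both types cannot be verified. The union preserves both members.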
## Variable Type Declarations
Pyright treats variable type annotations as type declarations. If a variable is not annotated, pyright allows any value to be assigned to that variable, and its type is inferred to be the union of all assigned types.
Mypy's behavior for variables depends on whether the [`--allow-redefinition`](https://mypy.readthedocs.io/en/stable/command_line.html#cmdoption-mypy-allow-redefinition) option is specified. If redefinitions are not allowed, then mypy typically treats the first assignment (the one with the smallest line number) as though it is an implicit type declaration.
```python
def func1(condition: bool):
    if condition:
        x = 3 # Mypy treats this as an implicit type declaration
    else:
        x = "" # Mypy treats this as an error because `x` is implicitly declared as `int`

def func2(condition: bool):
    x = None # Mypy provides some exceptions; this is not considered an implicit type declaration

    if condition:
        x = "" # This is not considered an error

def func3(condition: bool):
    x = [] # Mypy doesn't treat this as a declaration

    if condition:
        x = [1, 2, 3] # The type of `x` is declared as `list[int]`
```
Pyright's behavior is more consistent, is conceptually simpler and more natural for Python developers, leads to fewer false positives, and eliminates the need for many otherwise-necessary variable type annotations.
## Class and Instance Variable Inference
Pyright handles instance and class variables consistently with local variables. If a type annotation is provided for an instance or class variable (either within the class or one of its base classes), pyright treats this as a type declaration and enforces it accordingly. If a class implementation does not provide a type annotation for an instance or class variable and its base classes likewise do not provide a type annotation, the variable's type is inferred from all assignments within the class implementation.
```python
class A:
    def method1(self) -> None:
        self.x = 1

    def method2(self) -> None:
        self.x = "" # Mypy treats this as an error because `x` is implicitly declared as `int`

a = A()
reveal_type(a.x) # pyright: int | str

a.x = "" # Pyright allows this because the type of `x` is `int | str`
a.x = 3.0 # Pyright treats this as an error because the type of `x` is `int | str`
```
## Class and Instance Variable Enforcement
Pyright distinguishes between “pure class variables”, “regular class variables”, and “pure instance variables”. For a detailed explanation, refer to [this documentation](https://github.com/microsoft/pyright/blob/main/docs/type-concepts.md#class-and-instance-variables).
Mypy does not distinguish between class variables and instance variables in all cases. This is a [known issue](https://github.com/python/mypy/issues/240).
```python
from typing import ClassVar

class A:
    x: int = 0 # Regular class variable
    y: ClassVar[int] = 0 # Pure class variable

    def __init__(self):
        self.z = 0 # Pure instance variable

print(A.x)
print(A.y)
print(A.z) # pyright: error, mypy: no error
```
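The error that pyright reports for a pure instance variable accessed through the class corresponds to a real runtime failure. The following sketch, using a hypothetical `Counter` class, checks this at runtime: the attribute assigned in `__init__` exists on instances but not on the class object itself.

```python
from typing import ClassVar


class Counter:
    step: int = 1             # regular class variable
    total: ClassVar[int] = 0  # pure class variable

    def __init__(self) -> None:
        self.current = 0      # pure instance variable


# The class object has no `current` attribute; only instances do.
class_has_current = hasattr(Counter, "current")
instance_has_current = hasattr(Counter(), "current")
```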
## Assignment-based Type Narrowing
Pyright applies type narrowing for variable assignments. This is done regardless of whether the assignment statement includes a variable type annotation. Mypy skips assignment-based type narrowing when the target variable includes a type annotation. The consensus of the typing community is that mypy's behavior here is inconsistent, and there are [plans to eliminate this inconsistency](https://github.com/python/mypy/issues/2008).
```python
from typing import Sequence

v1: Sequence[int]
v1 = [1, 2, 3]
reveal_type(v1) # mypy and pyright both reveal `list[int]`

v2: Sequence[int] = [1, 2, 3]
reveal_type(v2) # mypy reveals `Sequence[int]` rather than `list[int]`
```
## Type Guards
Pyright supports several built-in type guards that mypy does not currently support. For a full list of type guard expression forms supported by pyright, refer to [this documentation](https://github.com/microsoft/pyright/blob/main/docs/type-concepts.md#type-guards).
The following expression forms are not currently supported by mypy as type guards:
* `x == L` and `x != L` (where L is an expression with a literal type)
* `len(x) == L` and `len(x) != L` (where x is tuple and L is a literal integer)
* `x in y` or `x not in y` (where y is instance of list, set, frozenset, deque, tuple, dict, defaultdict, or OrderedDict)
* `S in D` and `S not in D` (where S is a string literal and D is a final TypedDict)
* `bool(x)` (where x is any expression that is statically verifiable to be truthy or falsy in all cases)
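A few of the forms listed above can be sketched as follows. The function names are hypothetical, and the comments describe the narrowing that pyright's documentation describes for each form; per the list above, mypy would not narrow in these branches.

```python
from typing import List, Literal, Optional, Tuple, Union


def describe_mode(mode: Literal["r", "w"]) -> str:
    if mode == "r":
        # `x == L` form: `mode` can be narrowed to Literal["r"] in this branch.
        return "read"
    return "write"


def sum_point(point: Union[Tuple[int], Tuple[int, int]]) -> int:
    if len(point) == 2:
        # `len(x) == L` form: `point` can be narrowed to Tuple[int, int],
        # so indexing position 1 is known to be safe.
        return point[0] + point[1]
    return point[0]


def lookup(name: Optional[str], known: List[str]) -> str:
    if name in known:
        # `x in y` form: `name` can be narrowed to `str`, so `.upper()` is valid.
        return name.upper()
    return "UNKNOWN"
```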
## Aliased Conditional Expressions
Pyright supports the [aliasing of conditional expressions](https://github.com/microsoft/pyright/blob/main/docs/type-concepts.md#aliased-conditional-expression) used for type guards. Mypy does not currently support this, but it is a frequently-requested feature.
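A minimal sketch of an aliased condition, using a hypothetical `shout` function. The result of the `is not None` test is stored in a local variable, and that variable is later used as the condition.

```python
from typing import Optional


def shout(message: Optional[str]) -> str:
    has_message = message is not None  # the condition is stored in a local alias
    if has_message:
        # With aliasing support, `message` can be narrowed to `str` here even
        # though the `if` tests the alias rather than `message` itself.
        return message.upper()
    return ""
```

Without aliasing support, a checker still sees `message` as `str | None` inside the branch and may report the call to `.upper()`.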
## Narrowing for Implied Else
Pyright supports a feature called [type narrowing for implied else](https://github.com/microsoft/pyright/blob/main/docs/type-concepts.md#narrowing-for-implied-else) in cases where an `if` or `elif` clause has no associated `else` clause. This feature allows pyright to determine that all cases have already been handled by the `if` or `elif` statement and that the "implied else" would never be executed if it were present. This eliminates certain false positive errors. Mypy currently does not support this.
```python
from enum import Enum

class Color(Enum):
    RED = 1
    BLUE = 2

def is_red(color: Color) -> bool:
    if color == Color.RED:
        return True
    elif color == Color.BLUE:
        return False
    # mypy reports error: Missing return statement
```
## Narrowing Any
Pyright never narrows `Any` when performing type narrowing for assignments. Mypy is inconsistent about when it applies type narrowing to `Any` type arguments.
```python
from typing import Any

b: list[Any]
b = [1, 2, 3]
reveal_type(b) # mypy: list[Any]

c = [1, 2, 3]
b = c
reveal_type(b) # mypy: list[int]
```
## Inference of List, Set, and Dict Expressions
Pyright's inference rules for [list, set and dict expressions](https://github.com/microsoft/pyright/blob/main/docs/type-inference.md#list-expressions) differ from mypy's when heterogeneous entry types are used. Mypy uses a join operator to combine the types. Pyright uses either an `Unknown` or a union depending on configuration settings. A join operator often produces a type that is not what was intended, and this leads to false positive errors.
```python
x = [1, 3.4, ""]
reveal_type(x) # mypy: list[object], pyright: list[Unknown] or list[int | float | str]
```
For these mutable container types, pyright does not retain literal types when inferring the container type. Mypy is inconsistent, sometimes retaining literal types and sometimes not.
```python
from typing import Literal

def func(one: Literal[1]):
    reveal_type(one) # Literal[1]
    reveal_type([one]) # pyright: list[int], mypy: list[Literal[1]]

    reveal_type(1) # Literal[1]
    reveal_type([1]) # pyright: list[int], mypy: list[int]
```
## Inference of Tuple Expressions
Pyright's inference rules for [tuple expressions](https://github.com/microsoft/pyright/blob/main/docs/type-inference.md#tuple-expressions) differ from mypy's when tuple entries contain literals. Pyright retains these literal types, but mypy widens the types to their non-literal type. Pyright retains the literal types in this case because tuples are immutable, and more precise (narrower) types are almost always beneficial in this situation.
```python
from typing import Literal

x = (1, "stop")
reveal_type(x[1]) # pyright: Literal["stop"], mypy: str
y: Literal["stop", "go"] = x[1] # mypy: type error
```
## Assignment-Based Narrowing for Literals
When assigning a literal value to a variable, pyright narrows the type to include the literal. Mypy does not. Pyright retains the literal types in this case because more precise (narrower) types are typically beneficial and have little or no downside.
```python
x: str | None
x = 'a'
reveal_type(x) # pyright: Literal['a'], mypy: str
```
Pyright also supports “literal math” for simple operations between literals.
```python
from typing import Literal

def func1(a: Literal[1, 2], b: Literal[2, 3]):
    c = a + b
    reveal_type(c) # Literal[3, 4, 5]

def func2():
    c = "hi" + " there"
    reveal_type(c) # Literal['hi there']
```
## Type Narrowing for Asymmetric Descriptors
When pyright handles a write to a class variable that contains a descriptor object (including properties), it normally applies assignment-based type narrowing. However, when the descriptor is asymmetric (that is, its “getter” type is different from its “setter” type), pyright refrains from applying assignment-based type narrowing. For a full discussion of this, refer to [this issue](https://github.com/python/mypy/issues/3004). Mypy has not yet implemented the agreed-upon behavior, so its type narrowing behavior may differ from pyright's in this case.
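The sketch below shows what an asymmetric property looks like in practice. The `Thermostat` class is hypothetical: its setter accepts `float` or `str`, but its getter always returns `float`. Narrowing the attribute to `str` after the assignment would be wrong, because the value read back has a different type from the value written.

```python
from typing import Union


class Thermostat:
    def __init__(self) -> None:
        self._celsius = 0.0

    @property
    def celsius(self) -> float:
        return self._celsius

    @celsius.setter
    def celsius(self, value: Union[float, str]) -> None:
        # The setter accepts a wider type than the getter returns.
        self._celsius = float(value)


thermostat = Thermostat()
thermostat.celsius = "21.5"
reading = thermostat.celsius  # a float at runtime, even though a str was assigned
```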
## Parameter Type Inference
Mypy doesn't infer types of function parameters beyond `self` and `cls` parameters in methods.
Pyright implements several parameter type inference techniques that improve type checking and language service features in the absence of explicit parameter type annotations. For details, refer to [this documentation](https://github.com/microsoft/pyright/blob/main/docs/type-inference.md#parameter-type-inference).
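One technique described in the linked documentation is inference from a parameter's default value. The sketch below uses a hypothetical `scale` function; the comment reflects what that documentation describes rather than verified checker output.

```python
def scale(value, factor=2):
    # Neither parameter is annotated. Pyright's documentation describes
    # inferring `factor` as `int` from its default value; `value` has no
    # default, so its type cannot be inferred this way.
    return value * factor


doubled = scale(3)
quadrupled = scale(3, factor=4)
```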
## Constraint Solving
When evaluating a call expression that invokes a generic class constructor or a generic function, a type checker performs a process called “constraint solving” to solve the type variables found within the target function signature. The solved type variables are then applied to the return type of that function to determine the final type of the call expression. This process is called “constraint solving” because it takes into account various constraints that are specified for each type variable. These constraints include variance rules and type variable bounds.
Many aspects of constraint solving are not specified in PEP 484. This includes behaviors around literals, whether to use unions or joins to widen types, and how to handle cases where multiple types could satisfy all type constraints.
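As a basic illustration of the process, consider the hypothetical generic function `first` below. At each call site, the checker matches the argument type against `Sequence[T]`, solves `T`, and applies the solution to the return type.

```python
from typing import Sequence, TypeVar

T = TypeVar("T")


def first(items: Sequence[T]) -> T:
    return items[0]


# The argument is a list of ints, so T is solved as `int` and the call
# expression evaluates to `int`.
number = first([1, 2, 3])

# The argument is a list of strs, so T is solved as `str`.
word = first(["a", "b"])
```

This simple case has a single obvious solution. The subsections that follow cover cases where the specification leaves the answer open.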
### Literals
Pyright's constraint solver retains literal types only when they are required to satisfy constraints. In other cases, it widens the type to a non-literal type. Mypy is inconsistent in its handling of literal types.
```python
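from typing import Literal, TypeVar

T = TypeVar("T")
def identity(x: T) -> T: # Assumed generic pass-through helper
    return x

def func(one: Literal[1]):
    reveal_type(one) # Literal[1]
    v1 = identity(one)
    reveal_type(v1) # pyright: int, mypy: Literal[1]

    reveal_type(1) # Literal[1]
    v2 = identity(1)
    reveal_type(v2) # pyright: int, mypy: int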
```
### Unions vs Joins
As mentioned previously, pyright always uses unions rather than joins. Mypy typically uses joins.
```python
from typing import TypeVar

T = TypeVar("T")

def func3(val1: T, val2: T) -> T:
    ...

reveal_type(func3("", 1)) # mypy: object, pyright: str | int
```
### Ambiguous Solution Scoring
In cases where more than one solution is possible for a type variable, both pyright and mypy employ various heuristics to pick the “best” solution. These heuristics are complex and difficult to document in their fullness. Pyright's general strategy is to return the “simplest” type that meets the constraints.
Consider the expression `make_list(x)` in the example below. The type constraints for `T` could be satisfied with either `int` or `list[int]`, but it's much more likely that the developer intended the former (simpler) solution. Pyright calculates all possible solutions and “scores” them according to complexity, then picks the type with the best score. In rare cases, there can be two results with the same score, in which case pyright arbitrarily picks one as the winner.
Mypy produces errors with this sample.
```python
from typing import Iterable, TypeVar

T = TypeVar("T")

def make_list(x: T | Iterable[T]) -> list[T]:
    return list(x) if isinstance(x, Iterable) else [x]

def func2(x: list[int], y: list[str] | int):
    v1 = make_list(x)
    reveal_type(v1) # pyright: `list[int]` (`list[list[int]]` is also a valid answer)

    v2 = make_list(y)
    reveal_type(v2) # pyright: `list[int | str]` (`list[list[str] | int]` is also a valid answer)
```
## Constrained Type Variables
When mypy analyzes a class or function that has in-scope constrained TypeVars, it analyzes the class or function multiple times, once for each constraint. This can produce multiple errors.
```python
from typing import Any, AnyStr, TypeVar

T = TypeVar("T", list[Any], set[Any])

def func(a: AnyStr, b: T):
    reveal_type(a) # Mypy reveals 2 different types (`str` and `bytes`), pyright reveals `AnyStr`
    return a + b # Mypy reports 4 errors
```
Pyright cannot use the same multi-pass technique as mypy in this case. It needs to produce a single type for any given identifier to support language server features. Pyright instead uses a mechanism called [conditional types](https://github.com/microsoft/pyright/blob/main/docs/type-concepts.md#constrained-type-variables-and-conditional-types). This approach allows pyright to handle some constrained TypeVar use cases that mypy cannot, but there are conversely other use cases that mypy can handle and pyright cannot.
## “Unknown” Type and Strict Mode
Pyright differentiates between explicit and implicit forms of `Any`. The implicit form is referred to as [`Unknown`](https://github.com/microsoft/pyright/blob/main/docs/type-inference.md#unknown-type). For example, if a parameter is annotated as `list[Any]`, that is a use of an explicit `Any`, but if a parameter is annotated as `list`, that is an implicit `Any`, so pyright refers to this type as `list[Unknown]`. Pyright implements several checks that are enabled in “strict” type-checking modes that report the use of an `Unknown` type. Such uses can mask type errors.
Mypy does not track the difference between explicit and implicit `Any` types, but it supports various checks that report the use of values whose type is `Any`: `--warn-return-any` and `--disallow-any-*`. For details, refer to [this documentation](https://mypy.readthedocs.io/en/stable/command_line.html#disallow-dynamic-typing).
Pyright's approach gives developers more control. It provides a way to be explicit about `Any` where that is the intent. When an `Any` is implicitly produced due to a missing type argument or some other condition that produces an `Any` within the type checker logic, the developer is alerted to that condition.
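The distinction can be seen in two otherwise identical functions. The function names below are hypothetical. Both behave the same at runtime, but the first states the intent to accept any element type, while the second simply omits the type argument.

```python
from typing import Any, List


def count_explicit(items: List[Any]) -> int:
    # Explicit `Any`: the author has opted out of element checking on purpose.
    return len(items)


def count_implicit(items: list) -> int:
    # Missing type argument: pyright refers to this as `list[Unknown]`, which
    # its strict-mode checks are designed to report.
    return len(items)


explicit_count = count_explicit([1, "a", None])
implicit_count = count_implicit([1, "a", None])
```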
## Overload Resolution
Overload resolution rules are under-specified in PEP 484. Pyright and mypy apply similar rules, but there are likely some complex edge cases where different results will be produced. For full documentation of pyright's overload behaviors, refer to [this documentation](https://github.com/microsoft/pyright/blob/main/docs/type-concepts.md#overloads).
## Import Statements
Pyright intentionally does not model any implicit side effects of the Python import loading mechanism. In general, such side effects cannot be modeled statically because they depend on execution order. Dependency on such side effects leads to fragile code, so pyright treats these as errors.
Mypy models some side effects of the import loader. If an import statement imports a submodule using a multi-part module reference, mypy assumes that all of the parent modules are also initialized and cached such that they can be referenced.
```python
import collections.abc
collections.deque() # Pyright produces an error here because the `collections` module wasn't explicitly imported
```
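One way to write code that does not rely on the import loader's side effects is to import each module that is referenced directly. A minimal sketch:

```python
import collections      # imported explicitly because `collections.deque` is used
import collections.abc  # imported explicitly because `collections.abc` is used

queue = collections.deque([1, 2, 3])
is_sequence = isinstance(queue, collections.abc.Sequence)
```

With both imports present, every module reference in the file is backed by an explicit import statement, so no checker needs to model the loader's behavior.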
## Circular References
Because mypy is a multi-pass analyzer, it is able to deal with certain forms of circular references that pyright cannot handle. Here are several examples of circularities that mypy resolves without errors but pyright does not.
1. A class declaration that references a metaclass whose declaration depends on the class.
```python
from typing import Generic, TypeVar

T = TypeVar("T")
class MetaA(type, Generic[T]): ...
class A(metaclass=MetaA["A"]): ...
```
2. A class declaration that uses a TypeVar whose bound or constraint depends on the class.
```python
from typing import Generic, TypeVar

T = TypeVar("T", bound="A")
class A(Generic[T]): ...
```
3. A class that is decorated with a class decorator that uses the class in the decorator's own signature.
```python
from typing import Callable

def my_decorator(x: Callable[..., "A"]) -> Callable[..., "A"]:
    return x

@my_decorator
class A: ...
```
## Class Decorator Evaluation
Pyright honors class decorators. Mypy largely ignores them. See [this issue](https://github.com/python/mypy/issues/3135) for details.
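To see why honoring class decorators matters, consider a decorator that changes what the decorated name refers to. The `singleton` decorator and `Settings` class below are hypothetical. After decoration, `Settings` is bound to an instance rather than a class, so a checker that ignores the decorator would have the wrong type for it.

```python
from typing import Type, TypeVar

T = TypeVar("T")


def singleton(cls: Type[T]) -> T:
    # Hypothetical decorator: replaces the class with a single instance of it.
    return cls()


@singleton
class Settings:
    def __init__(self) -> None:
        self.debug = False


# `Settings` now names an instance, so the instance attribute is available.
debug_enabled = Settings.debug
```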
## Support for Type Comments
Versions of Python prior to 3.0 did not have a dedicated syntax for supplying type annotations. Annotations therefore needed to be supplied using “type comments” of the form `# type: <annotation>`. Python 3.6 added the ability to supply type annotations for variables.
Mypy has full support for type comments. Pyright supports type comments only in locations where there is a way to provide an annotation using modern syntax. Pyright was written to assume Python 3.5 and newer, so support for older versions was not a priority.
```python
# The following type comment is supported by
# mypy but is rejected by pyright.
x, y = (3, 4) # type: (float, float)
# Using Python syntax from Python 3.6, this
# would be annotated as follows:
x: float
y: float
x, y = (3, 4)
```