Commit Graph

212 Commits

Author SHA1 Message Date
GregoryTravis
dcfbf841b3
Replace Table should_equal with should_equal_verbose (#6405)
Table.should_equal_verbose shows the contents of the tables on failure; let's make this the standard comparison.
2023-04-26 14:01:49 +00:00
GregoryTravis
afd804d529
5127 Add Table.parse_to_columns to parse a single column to a set of columns. (#6383)
Implement Table.parse_to_columns
2023-04-24 15:21:38 +00:00
Radosław Waśko
f3873f9768
Infer SQLite types locally (#6381)
Closes #6208
2023-04-24 10:55:12 +00:00
Radosław Waśko
a43d524336
Add typechecks to Aggregate and Cross Tab (#6380)
Follow up of #6298 as it grew too much. Adds the needed typechecks to aggregate operations. Ensures that the DB operations report `Floating_Point_Equality` warning consistently with in-memory.
2023-04-24 08:55:54 +00:00
GregoryTravis
22f820feb7
Add Table.parse_text_to_table to convert Text to a Table. (#6294)
2023-04-21 17:43:19 +00:00
Radosław Waśko
8db2ad51a1
Adding typechecks to Column Operations (#6298)
Closes #6106
2023-04-21 12:20:12 +00:00
James Dunkerley
0350762386
Add replace, trim to Column. Better number parsing. (#6253)
- Add `replace` with same syntax as on `Text` to an in-memory `Column`.
- Add `trim` with same syntax as on `Text` to an in-memory `Column`.
- Add `trim` to in-database `Column`.
- Added `is_supported` to dialects and exposed the dialect consistently on the `Connection`.
- Add `write_table` support to `JSON_File` allowing `Table.write` to write JSON.
- Updated the parsing for integers and decimals:
  - Support for currency symbols.
  - Support for brackets for negative numbers.
  - Automatic detection of decimal points and thousand separators.
  - Tighter rules for scientific and thousand separated numbers.
- Remove `replace_text` from `Table`.
- Remove `write_json` from `Table`.
2023-04-20 16:04:59 +00:00
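The number-parsing rules listed in #6253 (currency symbols, brackets for negatives, thousand separators) can be pictured with a small Python sketch. This is an illustration only, not the Enso implementation: the function name `parse_number`, the set of currency symbols, and the explicit separator parameters are all assumptions, and automatic separator detection is left out.

```python
import re

def parse_number(text, decimal_point=".", thousand_separator=","):
    """Parse text such as '$1,234.50' or '(42)'; return None if it is not a number."""
    s = text.strip()
    # Brackets denote a negative number, as in accounting notation.
    negative = s.startswith("(") and s.endswith(")")
    if negative:
        s = s[1:-1].strip()
    # Drop one leading currency symbol (hypothetical symbol set).
    if s and s[0] in "$£€¥":
        s = s[1:].strip()
    if s.startswith("-"):
        if negative:
            return None  # reject a sign combined with brackets
        negative = True
        s = s[1:]
    # Thousand separators must split the integer part into groups of three.
    pattern = (
        r"^(\d{1,3}(?:" + re.escape(thousand_separator) + r"\d{3})+|\d+)"
        r"(?:" + re.escape(decimal_point) + r"(\d+))?$"
    )
    match = re.match(pattern, s)
    if match is None:
        return None
    integer_part = match.group(1).replace(thousand_separator, "")
    fraction = match.group(2) or "0"
    value = float(integer_part + "." + fraction)
    return -value if negative else value
```

Under these assumptions, `parse_number("(42)")` is negative, while a malformed grouping such as `"1,23"` is rejected rather than silently reinterpreted.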
Pavel Marek
b42e910280
sort handles incomparable values (#5998)
* Update type ascriptions in some operators in Any

* Add @GenerateUncached to AnyToTextNode.

Will be used in another node with @GenerateUncached.

* Add tests for "sort handles incomparable types"

* Vector.sort handles incomparable types

* Implement sort handling for different comparators

* Comparison operators in Any do not throw Type_Error

* Fix some issues in Ordering_Spec

* Remove the remaining comparison operator overrides for numbers.

* Consolidate all sorting functionality into a single builtin node.

* Fix warnings attachment in sort

* PrimitiveValuesComparator handles other types than primitives

* Fix byFunc calling

* on function can be called from the builtin

* Fix build of native image

* Update changelog

* Add VectorSortTest

* Builtin method should not throw DataflowError.

If yes, the message is discarded (a bug?)

* TypeOfNode may not return only Type

* UnresolvedSymbol is not supported as `on` argument to Vector.sort_builtin

* Fix docs

* Fix bigint spec in LessThanNode

* Small fixes

* Small fixes

* Nothings and Nans are sorted at the end of default comparator group.

But not at the whole end of the resulting vector.

* Fix checking of `by` parameter - now accepts functions with default arguments.

* Fix changelog formatting

* Fix imports in DebuggingEnsoTest

* Remove Array.sort_builtin

* Add comparison operators to micro-distribution

* Remove Array.sort_builtin

* Replace Incomparable_Values by Type_Error in some tests

* Add on_incomparable argument to Vector.sort_builtin

* Fix after merge - Array.sort delegates to Vector.sort

* Add more tests for problem_behavior on Vector.sort

* SortVectorNode throws only Incomparable_Values.

* Delete Collections helper class

* Add test for expected failure for custom incomparable values

* Cosmetics.

* Fix test expecting different comparators warning

* isNothing is checked via interop

* Remove TruffleLogger from SortVectorNode

* Small review refactorings

* Revert "Remove the remaining comparison operator overrides for numbers."

This reverts commit 0df66b1080.

* Improve bench_download.py tool's `--compare` functionality.

- Output table is sorted by benchmark labels.
- Do not fail when there are different benchmark labels in both runs.

* Wrap potential interop values with `HostValueToEnsoNode`

* Use alter function in Vector_Spec

* Update docs

* Invalid comparison throws Incomparable_Values rather than Type_Error

* Number comparison builtin methods return Nothing in case of incomparables
2023-04-16 16:40:12 +02:00
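One of the behaviours described in #5998, placing `Nothing` and NaN values after the ordinary values, can be sketched in Python for a single group of numbers. The function name is hypothetical, the relative order of NaN and `None` is an assumption, and the grouping by comparator that the commit describes is not modelled here.

```python
import math

def sort_with_missing_last(values):
    """Sort numbers, placing NaN and then None after all ordinary values."""
    def key(v):
        if v is None:
            return (2, 0.0)  # None sorts last (assumed order)
        if isinstance(v, float) and math.isnan(v):
            return (1, 0.0)  # NaN sorts after ordinary numbers
        return (0, v)        # ordinary numbers keep their natural order
    return sorted(values, key=key)
```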
GregoryTravis
4dcf5faddd
Add split and tokenize to the Table. (#6233)
Implement split and tokenize for tables.
2023-04-14 16:03:02 +00:00
Radosław Waśko
0f4f8a0542
Full-joins in SQLite (#6215)
Closes #5254

In #6189 the SQLite version was bumped to a newer release which has builtin support for Full and Right joins, so a workaround is no longer needed.
2023-04-06 16:49:14 +00:00
Radosław Waśko
f5db35af07
Adjust {Table|Column}.parse to use Value_Type (#6213)
Closes #5660
2023-04-06 10:58:55 +00:00
Radosław Waśko
83b10a2088
Implement Table.union for Database backend (#6204)
Closes #5235
2023-04-06 08:40:34 +00:00
James Dunkerley
f26bcf6ab6
Small issues from working with Ned (#6160)
- `Process.run` now returns a `Process_Result` allowing the easy capture of stdout and stderr.
- Joining a column with a column name does not warn if adding just the prefix.
- Stop the table viz from changing case and adding spaces to the headers.
2023-04-03 13:01:42 +00:00
Radosław Waśko
6ddcb553e5
Date/time support for Postgres. Year/month/day operations on Columns. (#6153)
Closes #6115
2023-03-31 18:37:04 +00:00
Radosław Waśko
6f86115498
Proper implementation of Value Types in Table (#6073)
This is the first part of the #5158 umbrella task. It closes #5158; follow-up tasks are listed as a comment in the issue.

- Updates all prototype methods dealing with `Value_Type` with a proper implementation.
- Adds a more precise mapping from in-memory storage to `Value_Type`.
- Adds a dialect-dependent mapping between `SQL_Type` and `Value_Type`.
- Removes obsolete methods and constants on `SQL_Type` that were not portable.
- Ensures that in the Database backend, operation results are computed based on what the Database is meaning to return (by asking the Database about expected types of each operation).
  - But also ensures that the result types are sane.
    - While SQLite does not officially support a BOOLEAN affinity, we add a set of type overrides to our operations to ensure that Boolean operations will return Boolean values and will not be changed to integers as SQLite would suggest.
    - Some methods in SQLite fall back to a NUMERIC affinity unnecessarily, so stuff like `max(text, text)` will keep the `text` type instead of falling back to numeric as SQLite would suggest.
- Adds ability to use custom fetch / builder logic for various types, so that we can support vendor specific types (for example, Postgres dates).

# Important Notes
- There are some TODOs left in the code. I'm still aligning follow-up tasks - once done I will try to add references to relevant tasks in them.
2023-03-31 16:16:18 +00:00
GregoryTravis
6b9cbeacb2
Implement Regular Expression replace and update Text.replace to the new API (#5959)
Re-implement replace on top of Truffle regex.
2023-03-28 06:13:12 +00:00
James Dunkerley
58f2c7643f
Use new Enso Hash Codes and Comparable (#6060)
Enables `distinct`, `aggregate` and `cross_tab` to use the Enso hashing and equality operations.
Also, I rewired the way the ObjectComparators are obtained in polyglot code to be more consistent.

Add Comparator for `Day_Of_Week`, `Header`, `SQL_Type`, `Image` and `Matrix`.
Also, removed the custom `==` from these types as needed. (Closes #5626)
2023-03-24 15:02:25 +00:00
Hubert Plociniczak
8c6fd60aaf
Detect conflicts between exported types and FQNs (#5986)
Exporting types named the same as the module where they are defined in `Main` modules of library components may lead to accidental name conflicts. This became apparent when trying to access `Problem_Behavior` module via a fully qualified name and the compiler rejected it. This is due to the fact that `Main` module exported `Error` type defined in `Standard.Base.Error` module, thus making it impossible to access any other submodules of `Standard.Base.Error` via a fully qualified name.

This change adds a warning to FullyQualifiedNames pass that detects any such future problems.
While only `Error` module was affected, it was widely used in the stdlib, hence the number of changes.

Closes #5902.

# Important Notes
I left out the potential conflict in micro-distribution, thus ensuring we actually detect and report the warning.
2023-03-21 21:09:41 +00:00
Radosław Waśko
952beba8d1
Fix cross_tab column naming edge cases, add fill_empty (#5863)
Closes #5151 and adds some additional tests for `cross_tab` that verify duplicated and invalid names.

I decided that for empty or `Nothing` names, instead of replacing them with `Column` and implicitly losing connection with the value that was in the column, we should just error on such values.

To make handling of these easier, `fill_empty` was added, making it easy to replace the empty values with something else.

Also, `{is,fill}_missing` was renamed to `{is,fill}_nothing` to align with `Filter_Condition.Is_Nothing`.
2023-03-11 11:58:54 +00:00
Pavel Marek
5f7a4a5a39
Merge ordered and unordered comparators (#5845)
Merge _ordered_ and _unordered_ comparators into a single one.

# Important Notes
Comparator is now required to have only `compare` method:
```
type Comparator
    compare : T -> T -> (Ordering|Nothing)
    hash : T -> Integer
```
2023-03-11 05:43:22 +00:00
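The merged comparator shape from #5845 (one `compare` that may return no ordering, plus `hash`) can be mirrored in Python as a rough sketch. The `Version` type and `VersionComparator` class are hypothetical, and `-1`/`0`/`1` and `None` stand in for `Ordering` and `Nothing`.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Version:
    major: int
    minor: int

class VersionComparator:
    """Hypothetical comparator: one `compare` for ordering and equality, plus `hash`."""

    @staticmethod
    def compare(x, y) -> Optional[int]:
        # Return -1, 0 or 1 for comparable values, or None when no ordering exists.
        if not (isinstance(x, Version) and isinstance(y, Version)):
            return None
        a, b = (x.major, x.minor), (y.major, y.minor)
        return (a > b) - (a < b)

    @staticmethod
    def hash(x) -> int:
        # Values that compare as equal (0) must produce the same hash.
        return hash((x.major, x.minor))
```

Equality falls out of `compare` returning `0`, which is why a separate unordered comparator is no longer needed in this model.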
Radosław Waśko
91ef8acf35
Review generated Column names (#5850)
Closes #5583 and closes #5157
2023-03-10 19:07:58 +00:00
Radosław Waśko
62e57f5557
Test some Mismatched Quote edge cases in Delimited reader (#5810)
Follow-up to #5113 - I add some more edge case tests as we discussed with @jdunkerley

When debugging some quoting issues, I also realised the current `Mismatched_Quote` error did not provide enough information. So I amended it to at least include some context indicating which was the 'offending' cell.
2023-03-10 15:47:57 +00:00
Jaroslav Tulach
8bbdd1af5b
Meta.is_a consistent with case-type-of check (#5853)
Removing special handling of `AtomConstructor` in `Meta.is_a` check.

# Important Notes
A lot of tests are about to fail. Many of them indirectly call `Meta.is_a` with a constructor rather than type.
2023-03-10 07:41:04 +00:00
James Dunkerley
299bfd6b7d
Fixes from the Demo on 2nd March (#5823)
- Fix issue with Geo Map viz.
- Handle invalid format strings better in `Data_Formatter`.
- New constants for the ISO format strings (and a special ENSO_ZONED_DATE_TIME)
- Consistent Date Time format for parsing in all places.
- Avoid throwing exception in datetime parsing.
- Support for milliseconds (well nanoseconds) in Date_Time and Time_Of_Day.
- `Column.map` stays within Enso.
- Allow `Aggregate_Column.Group_By` in `cross_tab` group_by parameter.
2023-03-07 20:58:00 +00:00
Pavel Marek
b6e2319fcc
Comparators support partial ordering (#5778)
2023-03-07 04:16:38 +00:00
Radosław Waśko
da760aa27d
Review Text/Table.write problem behavior (#5816)
Closes #5114

Added tests for various problems scenarios when writing files.

And ensured that those tests are passing by fixing a few edge cases.
2023-03-07 02:25:13 +00:00
Radosław Waśko
2d29456ed1
Review File/Data read and read_text warnings (#5799)
Closes #5113

Fixes a bug where read-only files would be overwritten if File.write was used in backup mode, and added tests to avoid such regression. To implement it, introduced an `is_writable` property on `File`.
2023-03-06 03:43:38 +00:00
James Dunkerley
01fc34c18a
Improving Expression Support for In Database (#5790)
- Adjust Excel Workbook write behaviour.
- Support Nothing / Null constants.
- Deduce the type of arithmetic operations and `iif`.
- Allow Date_Time constants, treating as local timezone.
- Removed the `to_column_name` and `ensure_sane_name` code.
2023-03-03 12:03:05 +00:00
Radosław Waśko
b764b0b7b7
Improve error handling of Connection.query (#5693)
Closes #5252
2023-02-24 17:15:10 +00:00
Radosław Waśko
793eafc866
Improve Table.parse_values API (#5692)
Closes #5111
2023-02-24 13:35:01 +00:00
James Dunkerley
652b8d5db3
Update rename_columns to new API design, add first_row, second_row and last_row functions to the table. (#5719)
- Updates the `rename_columns` API.
- Add `first_row`, `second_row` and `last_row` to the Table types.
- New option for reading only last row of ResultSet.
2023-02-23 19:42:45 +00:00
Radosław Waśko
3027c6f3a2
Ensure entries containing newlines are quoted when writing Delimited Files (#5652)
Fixes #5638
2023-02-17 00:57:48 +00:00
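The rule enforced by #5652, that a cell containing a newline must be quoted so it survives a round trip, can be demonstrated with Python's standard `csv` module. This is only an analogy for the behaviour; the helper name `write_delimited` is hypothetical and the Enso Delimited writer is not shown.

```python
import csv
import io

def write_delimited(rows):
    """Serialize rows to delimited text, quoting only the fields that need it."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, quoting=csv.QUOTE_MINIMAL)
    writer.writerows(rows)
    return buffer.getvalue()

text = write_delimited([["id", "note"], [1, "first line\nsecond line"]])
# Reading the text back yields the original cell, newline included.
rows = list(csv.reader(io.StringIO(text, newline="")))
```

Without the quotes, a reader would see the embedded newline as a row boundary and split the record in two.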
James Dunkerley
1bc27501e6
Remove Column type from Aggregate_Column, simplify Column_Selector, some new File_Formats (#5646)
- Updated `Widget.Vector_Editor` ready for use by IDE team.
- Added `get` to `Row` to make API more aligned.
- Added `first_column`, `second_column` and `last_column` to `Table` APIs.
- Adjusted `Column_Selector` and associated methods to have simpler API.
- Removed `Column` from `Aggregate_Column` constructors.
- Added new `Excel_Workbook` type and added to `Excel_Section`.
- Added new `SQLiteFormatSPI` and `SQLite_Format`.
- Added new `ImageFormatSPI` and `Image_Format`.
2023-02-16 15:15:49 +00:00
Radosław Waśko
a02eab451e
Implement basic warnings for column arithmetic, review warnings on expressions and filter (#5605)
Closes #5109

# Important Notes
- Currently the tests pass for the in-memory parts of Common_Table_Operations, but still some stuff not working on DB backends - in progress.
2023-02-14 09:33:04 +00:00
Pavel Marek
1f8511dab2
Add Comparator conversion for all types (#4067)
Add `Comparator` type class emulation for all types. Migrate all the types in stdlib to this new `Comparator` API. The main documentation is in `Ordering.enso`.

Fixes these pivotals:
- https://www.pivotaltracker.com/story/show/183945328
- https://www.pivotaltracker.com/story/show/183958734
- https://www.pivotaltracker.com/story/show/184380208

# Important Notes
- The new Comparator API forces users to specify both `equals` and `hash` methods on their custom comparators.
- All the `compare_to` overrides were replaced by definition of a custom _ordered_ comparator.
- All the call sites of `x.compare_to y` method were replaced with `Ordering.compare x y`.
- `Ordering.compare` is essentially a shortcut for `Comparable.from x . compare x y`.
- The default comparator for `Any` is `Default_Unordered_Comparator`, which just forwards to the builtin `EqualsNode` and `HashCodeNode` nodes.
- For `x`, one can get its hash with `Comparable.from x . hash x`.
- This makes `hash` as _hidden_ as possible. There are no other public methods to get a hash code of an object.
- Comparing `x` and `y` can be done either by `Ordering.compare x y` or `Comparable.from x . compare x y` instead of `x.compare_to y`.
2023-02-10 09:22:11 +00:00
James Dunkerley
1c821e22cf
Some fixes from the Anagrams experiment. (#5592)
- Fixes the display of Date, Time_Of_Day and Date_Time so it doesn't wrap.
- Adjust serialization of large integer values for JS and display within table.
- Workaround for issue with using `.lines` in the Table (new bug filed).
- Disabled warning on no specified `separator` on `Concatenate`.

Does not include fix for aggregation on integer values outside of `long` range.
2023-02-08 22:17:00 +00:00
Radosław Waśko
4f90946d1e
Rework Invalid Aggregations (#5579)
Closes #5108
2023-02-08 18:39:09 +00:00
Radosław Waśko
3c72ab08c4
Review Missing_Input_Column and Column_Index_Out_Of_Range warnings (#4118)
Implements https://www.pivotaltracker.com/story/show/184226383
2023-02-06 19:52:25 +00:00
James Dunkerley
0790ce494f
New set function, parse a column (#4097)
- New `set` function design - takes a `Column` and works with that more easily and supports control of `Set_Mode`.
- New simple `parse` API on `Column`.
- Separated expression support for `filter` to new `filter_by_expression` on `Table`.
- New `compute` function allowing creation of a column from an expression.
- Added case sensitivity argument to `Column` based on `starts_with`, `ends_with` and `contains`.
- Added case sensitivity argument to `Filter_Condition` for `Starts_With`, `Ends_With`, `Contains` and `Not_Contains`.
- Fixed the issue in JS Table visualisation where JavaScript date was incorrectly set.
- Some dynamic dropdown expressions - experimenting with ways to use them.
- Fixed issue with `.pretty` that wasn't escaping `\`.
- Changed default Postgres DB to `postgres`.
- Fixed SQLite support for starts_with, ends_with and contains to be consistent (using GLOB not LIKE).
2023-01-31 20:48:16 +00:00
Radosław Waśko
c965ad3455
Review Table.order_by (#4104)
2023-01-31 18:29:02 +00:00
Radosław Waśko
b9dbfd036f
First steps of the Problem Handling refactor to the new design (#4086)
Implements:
- https://www.pivotaltracker.com/story/show/184226137
- https://www.pivotaltracker.com/story/show/184226434
- https://www.pivotaltracker.com/story/show/184226462
2023-01-30 16:48:06 +00:00
Radosław Waśko
778d28fba3
Table with no columns is not valid, No_Output_Columns is always an error (#4073)
Implements https://www.pivotaltracker.com/story/show/184226020
2023-01-25 02:40:23 +00:00
Radosław Waśko
d2e57edc8b
Add Table.cross_join and Table.zip to In-Memory Table (#4063)
Implements https://www.pivotaltracker.com/story/show/184239059
2023-01-23 13:19:52 +00:00
Pavel Marek
fcc2163ae3
All Enso objects are hashable (#3878)
* Hash codes prototype

* Remove Any.hash_code

* Improve caching of hashcode in atoms

* [WIP] Add Hash_Map type

* Implement Any.hash_code builtin for primitives and vectors

* Add some values to ValuesGenerator

* Fix example docs on Time_Zone.new

* [WIP] QuickFix for HashCodeTest before PR #3956 is merged

* Fix hash code contract in HashCodeTest

* Add times and dates values to HashCodeTest

* Fix docs

* Remove hashCodeForMetaInterop specialization

* Introduce snapshotting of HashMapBuilder

* Add unit tests for EnsoHashMap

* Remove duplicate test in Map_Spec.enso

* Hash_Map.to_vector caches result

* Hash_Map_Spec is a copy of Map_Spec

* Implement some methods in Hash_Map

* Add equalsHashMaps specialization to EqualsAnyNode

* get and insert operations are able to work with polyglot values

* Implement rest of Hash_Map API

* Add test that inserts elements with keys with same hash code

* EnsoHashMap.toDisplayString use builder storage directly

* Add separate specialization for host objects in EqualsAnyNode

* Fix specialization for host objects in EqualsAnyNode

* Add polyglot hash map tests

* EconomicMap keeps reference to EqualsNode and HashCodeNode.

Rather than passing these nodes to `get` and `insert` methods.

* HashMapTest run in polyglot context

* Fix containsKey index handling in snapshots

* Remove snapshots field from EnsoHashMapBuilder

* Prepare polyglot hash map handling.

- Hash_Map builtin methods are separate nodes

* Some bug fixes

* Remove ForeignMapWrapper.

We would have to wrap foreign maps in assignments for this to be efficient.

* Improve performance of Hash_Map.get_builtin

Also, if_nothing parameter is suspended

* Remove to_flat_vector.

Interop API requires nested vector (our previous to_vector implementation). Seems that I misunderstood the docs the first time I read them.

- to_vector does not sort the vector by keys by default

* Fix polyglot hash maps method dispatch

* Add tests that effectively test hash code implementation.

Via hash map that behaves like a hash set.

* Remove Hashcode_Spec

* Add some polyglot tests

* Add Text.== tests for NFD normalization

* Fix NFD normalization bug in Text.java

* Improve performance of EqualsAnyNode.equalsTexts specialization

* Properly compute hash code for Atom and cache it

* Fix Text specialization in HashCodeAnyNode

* Add Hash_Map_Spec as part of all tests

* Remove HashMapTest.java

Providing all the infrastructure for all the needed Truffle nodes is no longer manageable.

* Remove rest of identityHashCode message implementations

* Replace old Map with Hash_Map

* Add some docs

* Add TruffleBoundaries

* Formatting

* Fix some tests to accept unsorted vector from Map.to_vector

* Delete Map.first and Map.last methods

* Add specialization for big integer hash

* Introduce proper HashCodeTest and EqualsTest.

- Use jUnit theories.
- Call nodes directly

* Fix some specializations for primitives in HashCodeAnyNode

* Fix host object specialization

* Remove Any.hash_code

* Fix import in Map.enso

* Update changelog

* Reformat

* Add truffle boundary to BigInteger.hashCode

* Fix performance of HashCodeTest - initialize DataPoints just once

* Fix MetaIsATest

* Fix ValuesGenerator.textual - Java's char is not Text

* Fix indent in Map_Spec.enso

* Add maps to datapoints in HashCodeTest

* Add specialization for maps in HashCodeAnyNode

* Add multiLevelAtoms to ValuesGenerator

* Provide a workaround for non-linear key inserts

* Fix specializations for double and BigInteger

* Cosmetics

* Add truffle boundaries

* Add allowInlining=true to some truffle boundaries.

Increases performance a lot.

* Increase the size of vectors, and warmup time for Vector.Distinct benchmark

* Various small performance fixes.

* Fix Geo_Spec tests to accept unsorted Map.to_vector

* Implement Map.remove

* Fix Visualization tests to accept unsorted Map.to_vector

* Treat java.util.Properties as Map

* Add truffle boundaries

* Invoke polyglot methods on java.util.Properties

* Ignore python tests if python lang is missing
2023-01-19 10:33:25 +01:00
James Dunkerley
48e5ed9eea
Some little bits from Book Club week 1 (#4058)
- Add `get` to Table.
- Correct `Count Nothing` examples.
- Add `join` to File.
- Add `File_Format.all` listing all installed formats.
- Add some more ALIAS entries.
2023-01-18 11:46:13 +00:00
Radosław Waśko
8853053020
Division in Columns within InDB is integer based if both columns are integers (#4057)
Fixes https://www.pivotaltracker.com/story/show/184073099

# Important Notes
- Since now the only operator on columns for division, `/`, returns floats, it may be worth creating an additional `div` operator exposing integer division. But that will be done as a separate task aligning column operator APIs.
2023-01-17 20:29:25 +00:00
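The database behaviour behind #4057 can be reproduced with Python's built-in `sqlite3` module: dividing two integer columns truncates, while casting one operand yields a float. The table and column names are made up for the example, and the cast is just one way to get float division, not necessarily what the commit generates.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE t (a INTEGER, b INTEGER)")
connection.execute("INSERT INTO t VALUES (7, 2)")

# Both operands are integers, so the database truncates the quotient.
integer_result = connection.execute("SELECT a / b FROM t").fetchone()[0]
# Casting one operand to REAL forces floating-point division.
float_result = connection.execute("SELECT CAST(a AS REAL) / b FROM t").fetchone()[0]
connection.close()
```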
Radosław Waśko
082e0bfd0d
Add Table.union to the In-Memory Table. (#4052)
Implements https://www.pivotaltracker.com/story/show/183854144
2023-01-17 00:34:57 +00:00
James Dunkerley
c4c35c92b7
Align Vector API with design, add some extra functions from AoC (#4026)
**Vector**
- Adjusted `Vector.sort` to be `Vector.sort order on by`.
- Adjusted other sort to use `order` for direction argument.
- Added `insert`, `remove`, `index_of` and `last_index_of` to `Vector`.
- Added `start` and `if_missing` arguments to `find` on `Vector`, and adjusted the default to be a `Not_Found` error.
- Added type checking to `+` on `Vector`.
- Altered `first`, `second` and `last` to error with `Index_Out_Of_Bounds` on `Vector`.
- Removed `sum`, `exists`, `head`, `init`, `tail`, `rest`, `append`, `prepend` from `Vector`.

**Pair**
- Added `last`, `any`, `all`, `contains`, `find`, `index_of`, `last_index_of`, `reverse`, `each`, `fold` and `reduce` to `Pair`.
- Added `get` to `Pair`.

**Range**
- Added `first`, `second`, `index_of`, `last_index_of`, `reverse` and `reduce` to `Range`.
- Added `at` and `get` to `Range`.
- Added `start` and `if_missing` arguments to `find` on `Range`.
- Simplified `last` and `length` of `Range`.
- Removed `exists` from `Range`.

**List**
- Added `second`, `find`, `index_of`, `last_index_of`, `reverse` and `reduce` to `List`.
- Added `at` and `get` to `List`.
- Removed `exists` from `List`.
- Made `all` short-circuit if any fail on `List`.
- Altered `is_empty` to not compute the length of `List`.
- Altered `first`, `tail`, `head`, `init` and `last` to error with `Index_Out_Of_Bounds` on `List`.

**Others**
- Added `first`, `second`, `last`, `get` to `Text`.
- Added wrapper methods to the Random_Number_Generator so you can get random values more easily.
- Adjusted `Aggregate_Column` to operate on the first column by default.
- Added `contains_key` to `Map`.
- Added ALIAS to `row_count` and `order_by`.
2023-01-12 13:32:24 +00:00
Radosław Waśko
0088096a58
Implement Distinct for the Database backends (#4027)
Implements https://www.pivotaltracker.com/story/show/182307281
2023-01-11 22:46:54 +00:00
Radosław Waśko
8c661fdb74
Database Joins (#4007)
Implements https://www.pivotaltracker.com/story/show/184032869

# Important Notes
- Currently we get failures in Full joins on Postgres which show a more serious problem - amending equality to ensure that `[NULL = NULL] == True` breaks hash/merge based indexing - so such joins will be extremely inefficient. All our joins currently rely on this notion of equality which will mean all of our DB joins will be extremely inefficient.
- We need to find a solution that will support nulls and still work OK with indices. However, after exploring a few approaches (`COALESCE(a = b, a IS NULL AND b is NULL)`, `a IS NOT DISTINCT FROM b`, `(a = b) OR (a IS NULL AND b is NULL)`), all of which failed with `ERROR: FULL JOIN is only supported with merge-joinable or hash-joinable join conditions`, I'm less certain that it is possible. Alternatively, we may need to change the NULL semantics to align with SQL. This seems likely to be the simpler solution, allowing us to generate simple, reliable SQL, whereas the NULL=NULL approach will corner us into nasty workarounds that depend heavily on the particular backend.
2023-01-05 10:36:22 +00:00
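The NULL semantics discussed in #4007 can be checked with Python's built-in `sqlite3` module. Plain `=` yields NULL when either side is NULL, while the `(a = b) OR (a IS NULL AND b IS NULL)` form from the notes treats two NULLs as equal. The helper names are hypothetical, and the sketch shows only the truth values. It does not reproduce the Postgres `FULL JOIN` error.

```python
import sqlite3

connection = sqlite3.connect(":memory:")

def sql_equal(a, b):
    """Standard SQL equality: the result is NULL (None) if either side is NULL."""
    return connection.execute("SELECT ? = ?", (a, b)).fetchone()[0]

def null_safe_equal(a, b):
    """Equality in which two NULLs count as equal, as in the join notes."""
    query = "SELECT (? = ?) OR (? IS NULL AND ? IS NULL)"
    return connection.execute(query, (a, b, a, b)).fetchone()[0]
```

Here `sql_equal(None, None)` returns `None`, so a join condition built on it drops the row, whereas `null_safe_equal(None, None)` returns `1`.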