Commit Graph

22 Commits

Author SHA1 Message Date
James Dunkerley
4c0647ea29
Stop publishing First/Last as constructors and use auto-scoping for take and drop. (#10467)
- Removes `First` and `Last` from the `Standard.Base` exports.
- Enable auto-scoping for all `Index_Sub_Range` and `Text_Sub_Range`.
- Update all use of those methods to use auto-scoping.
2024-07-08 10:26:30 +00:00
Pavel Marek
21e1284086
Enso tests can be run with filter from cmdline (#9065)
Simplify the `Test.Suite.run_with_filter` to accept a single filter parameter that searches for all the groups and specs that match that filter. This filter can be a simple text provided from the command line.

# Important Notes
- Pending groups are now printed at the end of the run
- `Test.Suite.run_with_filter` is simplified to accept a single filter parameter that is either `Text` or `Nothing`. See the docs.
- Passing a filter from the command line is therefore straightforward: it is treated as a regex.
- For convenience, I have left all the `main` methods in all the test sources. I have just refactored them to accept the `filter` argument from the command line.
- For example, to run only a single spec from `Vector_Spec.enso`, invoke `enso --run test/Base_Tests/src/Data/Vector_Spec.enso "should allow vector creation with a programmatic constructor"`
- **The majority of the PR is a regex replace** of `^main =` with `main filter=Nothing =` and of `suite.run_with_filter` with `suite.run_with_filter filter`.
- **Fixed some internal engine bugs:**
  - `AtomWithHole` allows specifying only one hole - https://github.com/enso-org/enso/pull/9065/files#diff-0f7bb7e85cf86a965de133aa7e6b5958ceb889bd1921c01e00d3a9ceb19626ef
  - NaN keys in hash maps are handled in polyglot maps as well - c5257f6c2b78f893214ff67300893b593ea05e21..db4b3c0e9828ee79208d52e02586b24bb845b0d6
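
The filtering behaviour described above can be sketched roughly as follows. This is a minimal Python illustration, not the Enso implementation: the `SUITE` data, the `select_specs` name, and the rule that a matching group selects all of its specs are assumptions made for the example.

```python
import re
from typing import Optional

# Hypothetical in-memory suite: group name -> list of spec names.
SUITE = {
    "Vector": [
        "should allow vector creation with a programmatic constructor",
        "should allow accessing elements",
    ],
    "Text": ["should compare strings"],
}


def select_specs(suite, name_filter: Optional[str] = None):
    """Return (group, spec) pairs selected by a regex filter.

    A filter of None selects everything. Assumption: a group whose name
    matches the filter contributes all of its specs.
    """
    if name_filter is None:
        return [(g, s) for g, specs in suite.items() for s in specs]
    pattern = re.compile(name_filter)
    selected = []
    for group, specs in suite.items():
        group_matches = pattern.search(group) is not None
        for spec in specs:
            if group_matches or pattern.search(spec) is not None:
                selected.append((group, spec))
    return selected
```

Calling `select_specs(SUITE, "programmatic constructor")` picks the single matching spec, which mirrors passing the spec name on the command line.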
2024-02-22 12:31:44 +00:00
Pavel Marek
f3f0697d56
Merge Test_New into Test (#8991)
Merges the temporary `Test_New` library into `Test`. This is the last PR in the series of PRs that refactor all the stdlib tests to the builder API.
2024-02-08 11:25:13 +00:00
Pavel Marek
83fffd9c05
Refactor stdlib tests to the builder API (#8968)
Follow-up of #8890

Refactor the rest of the tests to the builder API (`Test_New`):
- `Image_Tests`
- `Geo_Tests`
- `Google_Api_Test`
- `Examples_Test`
- `AWS_Tests`
- `Meta_Test_Suite_Tests`
- `Visualization_Tests`

# Important Notes
- Unrelated: Fix NPE in `File.new "/" . name`
2024-02-07 13:22:17 +00:00
Jaroslav Tulach
81f06456bf
400x faster with linear hashing of the hash map entries (#8425)
Fixes #5233 by removing `EconomicMap` & co. and using plain old good _linear hashing_. Fixes #8090 by introducing `StorageEntry.removed()` rather than copying the builder on each removal.
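
As a rough illustration of the idea, here is a small Python sketch of an open-addressing map that marks removed entries instead of copying the storage. It assumes that "linear hashing" here means linear probing and that `StorageEntry.removed()` acts like a tombstone; the class and its methods are hypothetical and do not mirror the engine's Java code.

```python
class LinearProbingMap:
    """Open-addressing hash map with linear probing and tombstone removal."""

    _EMPTY = object()    # slot that was never used
    _REMOVED = object()  # tombstone: slot held an entry that was removed

    def __init__(self, capacity=8):
        self._keys = [self._EMPTY] * capacity
        self._values = [None] * capacity
        self._size = 0  # live entries
        self._used = 0  # live entries plus tombstones

    def _probe(self, key):
        # Visit every slot once, starting at the key's home slot.
        n = len(self._keys)
        start = hash(key) % n
        for i in range(n):
            yield (start + i) % n

    def get(self, key, default=None):
        for i in self._probe(key):
            k = self._keys[i]
            if k is self._EMPTY:
                return default  # an empty slot ends the probe chain
            if k is not self._REMOVED and k == key:
                return self._values[i]
        return default

    def insert(self, key, value):
        if (self._used + 1) * 2 > len(self._keys):
            self._resize()  # keep the load factor at or below 1/2
        free = None
        for i in self._probe(key):
            k = self._keys[i]
            if k is self._EMPTY:
                if free is None:
                    free = i
                break
            if k is self._REMOVED:
                if free is None:
                    free = i  # remember the first reusable tombstone
            elif k == key:
                self._values[i] = value  # overwrite existing entry
                return
        if self._keys[free] is self._EMPTY:
            self._used += 1
        self._keys[free] = key
        self._values[free] = value
        self._size += 1

    def remove(self, key):
        for i in self._probe(key):
            k = self._keys[i]
            if k is self._EMPTY:
                return False
            if k is not self._REMOVED and k == key:
                # Mark the slot instead of copying or shifting the storage.
                self._keys[i] = self._REMOVED
                self._values[i] = None
                self._size -= 1
                return True
        return False

    def _resize(self):
        live = [(k, v) for k, v in zip(self._keys, self._values)
                if k is not self._EMPTY and k is not self._REMOVED]
        capacity = max(8, len(live) * 4)
        self._keys = [self._EMPTY] * capacity
        self._values = [None] * capacity
        self._size = 0
        self._used = 0
        for k, v in live:
            self.insert(k, v)

    def __len__(self):
        return self._size
```

Removal only rewrites one slot, so its cost does not grow with the size of the map, and tombstones are dropped the next time the table is rebuilt.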
2023-12-01 06:43:13 +00:00
Radosław Waśko
b9dbfd036f
First steps of the Problem Handling refactor to the new design (#4086)
Implements:
- https://www.pivotaltracker.com/story/show/184226137
- https://www.pivotaltracker.com/story/show/184226434
- https://www.pivotaltracker.com/story/show/184226462
2023-01-30 16:48:06 +00:00
Radosław Waśko
778d28fba3
Table with no columns is not valid, No_Output_Columns is always an error (#4073)
Implements https://www.pivotaltracker.com/story/show/184226020
2023-01-25 02:40:23 +00:00
Pavel Marek
fcc2163ae3
All Enso objects are hashable (#3878)
* Hash codes prototype

* Remove Any.hash_code

* Improve caching of hashcode in atoms

* [WIP] Add Hash_Map type

* Implement Any.hash_code builtin for primitives and vectors

* Add some values to ValuesGenerator

* Fix example docs on Time_Zone.new

* [WIP] QuickFix for HashCodeTest before PR #3956 is merged

* Fix hash code contract in HashCodeTest

* Add times and dates values to HashCodeTest

* Fix docs

* Remove hashCodeForMetaInterop specialization

* Introduce snapshotting of HashMapBuilder

* Add unit tests for EnsoHashMap

* Remove duplicate test in Map_Spec.enso

* Hash_Map.to_vector caches result

* Hash_Map_Spec is a copy of Map_Spec

* Implement some methods in Hash_Map

* Add equalsHashMaps specialization to EqualsAnyNode

* get and insert operations are able to work with polyglot values

* Implement rest of Hash_Map API

* Add test that inserts elements with keys with same hash code

* EnsoHashMap.toDisplayString use builder storage directly

* Add separate specialization for host objects in EqualsAnyNode

* Fix specialization for host objects in EqualsAnyNode

* Add polyglot hash map tests

* EconomicMap keeps reference to EqualsNode and HashCodeNode.

Rather than passing these nodes to `get` and `insert` methods.

* HashMapTest run in polyglot context

* Fix containsKey index handling in snapshots

* Remove snapshots field from EnsoHashMapBuilder

* Prepare polyglot hash map handling.

- Hash_Map builtin methods are separate nodes

* Some bug fixes

* Remove ForeignMapWrapper.

We would have to wrap foreign maps in assignments for this to be efficient.

* Improve performance of Hash_Map.get_builtin

Also, if_nothing parameter is suspended

* Remove to_flat_vector.

Interop API requires a nested vector (our previous to_vector implementation). It seems that I misunderstood the docs the first time I read them.

- to_vector does not sort the vector by keys by default

* Fix polyglot hash maps method dispatch

* Add tests that effectively test hash code implementation.

Via hash map that behaves like a hash set.

* Remove Hashcode_Spec

* Add some polyglot tests

* Add Text.== tests for NFD normalization

* Fix NFD normalization bug in Text.java

* Improve performance of EqualsAnyNode.equalsTexts specialization

* Properly compute hash code for Atom and cache it

* Fix Text specialization in HashCodeAnyNode

* Add Hash_Map_Spec as part of all tests

* Remove HashMapTest.java

Providing all the infrastructure for all the needed Truffle nodes is no longer manageable.

* Remove rest of identityHashCode message implementations

* Replace old Map with Hash_Map

* Add some docs

* Add TruffleBoundaries

* Formatting

* Fix some tests to accept unsorted vector from Map.to_vector

* Delete Map.first and Map.last methods

* Add specialization for big integer hash

* Introduce proper HashCodeTest and EqualsTest.

- Use jUnit theories.
- Call nodes directly

* Fix some specializations for primitives in HashCodeAnyNode

* Fix host object specialization

* Remove Any.hash_code

* Fix import in Map.enso

* Update changelog

* Reformat

* Add truffle boundary to BigInteger.hashCode

* Fix performance of HashCodeTest - initialize DataPoints just once

* Fix MetaIsATest

* Fix ValuesGenerator.textual - Java's char is not Text

* Fix indent in Map_Spec.enso

* Add maps to datapoints in HashCodeTest

* Add specialization for maps in HashCodeAnyNode

* Add multiLevelAtoms to ValuesGenerator

* Provide a workaround for non-linear key inserts

* Fix specializations for double and BigInteger

* Cosmetics

* Add truffle boundaries

* Add allowInlining=true to some truffle boundaries.

Increases performance a lot.

* Increase the size of vectors, and warmup time for Vector.Distinct benchmark

* Various small performance fixes.

* Fix Geo_Spec tests to accept unsorted Map.to_vector

* Implement Map.remove

* Fix Visualization tests to accept unsorted Map.to_vector

* Treat java.util.Properties as Map

* Add truffle boundaries

* Invoke polyglot methods on java.util.Properties

* Ignore python tests if python lang is missing
2023-01-19 10:33:25 +01:00
Radosław Waśko
8c661fdb74
Database Joins (#4007)
Implements https://www.pivotaltracker.com/story/show/184032869

# Important Notes
- Currently, Full joins fail on Postgres, which reveals a more serious problem: amending equality to ensure that `[NULL = NULL] == True` breaks hash/merge-based indexing, so such joins will be extremely inefficient. All our joins currently rely on this notion of equality, which means that all of our DB joins will be extremely inefficient.
- We need to find a solution that supports nulls and still works well with indices. However, after exploring a few approaches (`COALESCE(a = b, a IS NULL AND b is NULL)`, `a IS NOT DISTINCT FROM b`, `(a = b) OR (a IS NULL AND b is NULL)`), all of which failed with `ERROR: FULL JOIN is only supported with merge-joinable or hash-joinable join conditions`, I'm less certain that this is possible. Alternatively, we may need to change the NULL semantics to align with SQL. This is likely the simpler solution, as it lets us generate simple, reliable SQL, whereas the NULL=NULL approach would corner us into nasty workarounds that depend heavily on the particular backend.
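
The difference between the two notions of equality can be seen in a small experiment. The sketch below uses Python's built-in `sqlite3` module with made-up tables `l` and `r`; it only illustrates the NULL semantics on an inner join and does not reproduce the Postgres `FULL JOIN` error quoted above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE l (k INTEGER);
    CREATE TABLE r (k INTEGER);
    INSERT INTO l VALUES (1), (NULL);
    INSERT INTO r VALUES (1), (NULL);
""")

# In SQL, comparing NULL with NULL yields NULL rather than true.
null_eq_null = conn.execute("SELECT NULL = NULL").fetchone()[0]

# Plain SQL equality: the NULL rows do not join with each other.
sql_equality = conn.execute(
    "SELECT l.k, r.k FROM l JOIN r ON l.k = r.k"
).fetchall()

# Null-safe equality, one of the variants listed above: two NULLs match.
null_safe = conn.execute(
    "SELECT l.k, r.k FROM l JOIN r "
    "ON (l.k = r.k) OR (l.k IS NULL AND r.k IS NULL)"
).fetchall()

conn.close()
```

With plain equality only the `1 = 1` pair is returned, while the null-safe condition also pairs the two NULL rows, which is the behaviour the `[NULL = NULL] == True` semantics requires.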
2023-01-05 10:36:22 +00:00
James Dunkerley
ace459ed53
Let JavaScript parse JSON and write JSON ... (#3987)
Use JavaScript to parse and serialise to JSON. Parses to native Enso object.
- `.to_json` now returns a `Text` of the JSON.
- `Json` methods are now `parse`, `stringify` and `from_pairs`.
- New `JS_Object` representing a JavaScript Object.
- `.to_js_object` allows types to customize serialization by returning a `JS_Object`.
- The default JSON format for an Atom now has a `type` and a `constructor` property (or a method to call, as needed to deserialise).
- Removed `.into` support for now.
- Added JSON File Format and SPI to allow `Data.read` to work.
- Added `Data.fetch` API for easy Web download.
- The default visualization for `JS_Object` truncates, and the `Vector` default now truncates children too.

Fixes a defect where types with no constructor crashed on `to_json` (e.g. `Matching_Mode.Last.to_json`).
Adjusted default visualisation for Vector, so it doesn't serialise an array of arrays forever.
Likewise, JS_Object default visualisation is truncated to a small subset.

New convention:
- `.get` returns `Nothing` if a key or index is not present. Takes an `other` argument allowing control of default.
- `.at` raises an error if a key or index is not present.
- `Nothing` gains a `get` method allowing for easy propagation.
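
The `get` / `at` convention can be illustrated with a short Python sketch. The `Lookup` class is hypothetical, and Python's `None` stands in for Enso's `Nothing`; only the two access behaviours are modelled.

```python
class Lookup:
    """Hypothetical wrapper illustrating the `get` / `at` convention."""

    def __init__(self, items):
        self._items = dict(items)

    def get(self, key, other=None):
        # Missing key: return the caller-supplied default (None by default).
        return self._items.get(key, other)

    def at(self, key):
        # Missing key: fail loudly instead of returning a default.
        if key not in self._items:
            raise KeyError(f"no such key or index: {key!r}")
        return self._items[key]
```

Use `get` when absence is an expected case and `at` when a missing key indicates a bug.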
2022-12-20 10:33:46 +00:00
Hubert Plociniczak
06bd69436b
Import modules' extension methods only with unqualified import statements (#3906)
# Important Notes
Note that one cannot
```
import Standard.Table as Table_Module
```
because of the 2-component name restriction that gets desugared to `Standard.Table.Main` and we have to write
```
import Standard.Table.Main as Table_Module
```
in a few places. Once we move `Json.to_table` extension this can be improved.
2022-12-01 10:13:34 +00:00
James Dunkerley
701c644d0e
Tidy up the remaining ones except Base... (#3797)
- Removed `Dubious constructor export` from Examples, Geo, Google_Api, Image and Test.
- Updated Google_Api project to meet newer code standards.
- Restructured `Standard.Test`:
  - `Main.enso` now exports `Bench`, `Faker`, `Problems`, `Test`, `Test_Suite`
  - `Test.Suite` methods moved into a `Test_Suite` type.
  - Moved `Bench.measure` into `Bench` type.
  - Separated the reporting to a `Test_Reporter` module.
  - Moved `Faker` methods into `Faker` type.
  - Removed `Verbs` and `.should` method.
  - Added `should_start_with` and `should_contain` extensions to `Any`.
- Restructured `Standard.Image`:
  - Merged Codecs methods into `Image`.
  - Export `Image`, `Read_Flag`, `Write_Flag` and `Matrix` as types from `Main.enso`.
  - Merged the internal methods into `Matrix` and `Image`.
- Fixed `Day_Of_Week` to be exported as a type and sort the `from` method.
2022-10-17 11:27:27 +00:00
James Dunkerley
185378f07c
Moving library statics to type for Table. (#3760)
- Generally export types not modules from the `Standard.Table` import.
- Moved `new` and `from_rows` from the `Standard.Table` library into the `Table` type.
- Renamed `Standard.Table.Data.Storage.Type` to `Standard.Table.Data.Storage.Storage`.
- Removed the internal `from_columns` method.
- Removed `join` and `concat` and merged into instance methods.
- Removed `Table` and `Column` from the `Standard.Database` exports.
- Removed `Standard.Table.Data.Column.Aggregate_Column` as it is no longer used.
2022-10-06 17:01:18 +00:00
Hubert Plociniczak
0e5df935d3
Don't rename imported Main module that only imports names (#3710)
It turns out that for a two-part import we had special code that would a) add the `Main` submodule and b) add an explicit rename.

b) is problematic because sometimes we only want to import specific names.
E.g.,
```
from Bar.Foo import Bar, Baz
```
would be translated to
```
from Bar.Foo.Main as Foo import Bar, Baz
```
and it should only be translated to
```
from Bar.Foo.Main import Bar, Baz
```

This change detects this scenario and does not add renames in that case.
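
The rewriting rule can be sketched as a small string transformation. This Python function is only an illustration of the two cases described above, not the compiler's implementation; the name `desugar_import` is made up, and other import forms (for example, imports that already carry a rename) are deliberately returned unchanged.

```python
def desugar_import(statement: str) -> str:
    """Point a two-part module path at its `Main` submodule.

    - `import A.B`              -> `import A.B.Main as B`  (rename added)
    - `from A.B import X, Y`    -> `from A.B.Main import X, Y`  (no rename)
    Anything else is returned unchanged in this sketch.
    """
    words = statement.split()
    if len(words) < 2:
        return statement
    path = words[1]
    is_two_part = path.count(".") == 1
    if words[0] == "import" and len(words) == 2 and is_two_part:
        # Plain module import: keep the short name available via a rename.
        short_name = path.split(".")[1]
        return f"import {path}.Main as {short_name}"
    if words[0] == "from" and is_two_part:
        # Only specific names are imported, so no rename is added.
        rest = " ".join(words[2:])
        return f"from {path}.Main {rest}"
    return statement
```

For instance, `desugar_import("from Bar.Foo import Bar, Baz")` yields `from Bar.Foo.Main import Bar, Baz`, matching the intended translation.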

Fixes [183276486](https://www.pivotaltracker.com/story/show/183276486).
2022-09-16 13:01:06 +00:00
James Dunkerley
0126f02e7b
Restructure File.read into the new design (#3701)
Changes following Marcin's work. Should be back to very similar public API as before.

- Add an "interface" type: `Standard.Base.System.File_Format.File_Format`.
- All `File_Format` types now have a `can_read` method to decide if they can read a file.
- Move `Standard.Table.IO.File_Format.Text.Text_Data` to `Standard.Base.System.File_Format.Plain_Text_Format.Plain_Text`.
- Move `Standard.Table.IO.File_Format.Bytes` to `Standard.Base.System.File_Format.Bytes`.
- Move `Standard.Table.IO.File_Format.Infer` to `Standard.Base.System.File_Format.Infer`. **(doesn't belong here...)**
- Move `Standard.Table.IO.File_Format.Unsupported_File_Type` to `Standard.Base.Error.Common.Unsupported_File_Type`.
- Add `Infer`, `File_Format`, `Bytes`, `Plain_Text`, `Plain_Text_Format` to `Standard.Base` exports.
- Fold extension methods of `Standard.Base.Meta.Unresolved_Symbol` into type.
- Move `Standard.Table.IO.File_Format.Auto` to `Standard.Table.IO.Auto_Detect.Auto_Detect`.
  - Added a `types` Vector of all the built in formats.
  - `Auto_Detect` asks each type if they `can_read` a file.
- Broke up and moved `Standard.Table.IO.Excel` into `Standard.Table.Excel`:
  - Moved `Standard.Table.IO.File_Format.Excel.Excel_Data` to `Standard.Table.Excel.Excel_Format.Excel_Format.Excel`.
  - Renamed `Sheet` to `Worksheet`.
  - Internal types `Reader` and `Writer` providing the actual read and write methods.
- Created `Standard.Table.Delimited` with similar structure to `Standard.Table.Excel`:
  - Moved `Standard.Table.IO.File_Format.Delimited.Delimited_Data` to `Standard.Table.Delimited.Delimited_Format.Delimited_Format.Delimited`.
  - Moved `Standard.Table.IO.Quote_Style` to `Standard.Table.Delimited.Quote_Style`.
  - Moved the `Reader` and `Writer` internal types into here. Renamed methods to have unique names.
- Add `Aggregate_Column`, `Auto_Detect`, `Delimited`, `Delimited_Format`, `Excel`, `Excel_Format`, `Sheet_Names`, `Range_Names`, `Worksheet` and `Cell_Range` to `Standard.Table` exports.
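
The `can_read` / `Auto_Detect` design can be sketched in Python as follows. The class names, the file extensions each format accepts, and the `UnsupportedFileType` exception are assumptions for the example; they only loosely echo the Enso types named above.

```python
from pathlib import Path


class UnsupportedFileType(Exception):
    """Raised when no registered format can read the file."""


class PlainText:
    @staticmethod
    def can_read(path: Path) -> bool:
        # Assumed extensions, for illustration only.
        return path.suffix.lower() in {".txt", ".log"}

    @staticmethod
    def read(path: Path) -> str:
        return path.read_text(encoding="utf-8")


class Bytes:
    @staticmethod
    def can_read(path: Path) -> bool:
        return path.suffix.lower() == ".dat"

    @staticmethod
    def read(path: Path) -> bytes:
        return path.read_bytes()


# Registry of known formats, analogous to the `types` Vector.
TYPES = [PlainText, Bytes]


def auto_detect(path: Path):
    """Ask each registered format in turn whether it can read the file."""
    for file_format in TYPES:
        if file_format.can_read(path):
            return file_format
    raise UnsupportedFileType(str(path))
```

Because detection only relies on `can_read`, supporting a new format means adding one more entry to `TYPES` without touching `auto_detect`.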
2022-09-15 14:48:46 +00:00
Jaroslav Tulach
2b9352d2fc
Lazy scatterplot for Vector & Table (#3655)
First of all this PR demonstrates how to implement _lazy visualization_:
- one needs to write/enhance Enso visualization libraries - this PR adds two optional parameters (`bounds` and `limit`) to `process_to_json_text` function.
- the `process_to_json_text` function can be tested by the standard Enso test harness, which this PR also does
- then one has to modify the JavaScript on the IDE side to construct the `setPreprocessor` expression using the optional parameters

The idea of _scatter plot lazy visualization_ is to limit the number of points the IDE requests. Initially the limit is set to `limit=1024`. The `Scatter_Plot.enso` then processes the data and selects/generates the `limit` subset. Right now it includes the `min` and `max` on both the `x` and `y` axes plus randomly chosen points up to the `limit`.

![Zooming In](https://user-images.githubusercontent.com/26887752/185336126-f4fbd914-7fd8-4f0b-8377-178095401f46.png)

The D3 visualization widget is capable of _zooming in_. When that happens, the JavaScript widget composes a new expression with `bounds` set to the newly visible area. Calling `setPreprocessor` makes the engine recompute the visualization data, filter out any data outside of the `bounds` and select another `limit` points from the new data. The IDE visualization then updates itself to display this more detailed data. Users can zoom in to see the smallest detail, where the number of points gets below `limit`, or they can select _Fit all_ to see all the data without any `bounds`.
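
The selection step can be sketched in Python as below. This is an assumption-laden illustration, not the code in `Scatter_Plot.enso`: the function name `select_points`, the `(x_min, y_min, x_max, y_max)` layout of `bounds`, and the `seed` parameter are all invented for the example.

```python
import random


def select_points(points, bounds=None, limit=1024, seed=None):
    """Pick at most `limit` points, keeping the extremes on both axes.

    points: list of (x, y) pairs.
    bounds: optional (x_min, y_min, x_max, y_max) of the visible area.
    Assumes limit >= 4 so that the four extremes always fit.
    """
    if bounds is not None:
        x_min, y_min, x_max, y_max = bounds
        points = [(x, y) for x, y in points
                  if x_min <= x <= x_max and y_min <= y <= y_max]
    if len(points) <= limit:
        return list(points)

    indices = range(len(points))
    # Indices of the min and max point on each axis (may coincide).
    extremes = {
        min(indices, key=lambda i: points[i][0]),
        max(indices, key=lambda i: points[i][0]),
        min(indices, key=lambda i: points[i][1]),
        max(indices, key=lambda i: points[i][1]),
    }
    rng = random.Random(seed)
    remaining = [i for i in indices if i not in extremes]
    extra = rng.sample(remaining, limit - len(extremes))
    chosen = sorted(extremes | set(extra))
    return [points[i] for i in chosen]
```

Zooming in corresponds to calling the function again with a tighter `bounds`, which lets points that were previously dropped by the sampling reappear.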

# Important Notes
Randomly selecting `limit` samples from the dataset may be misleading. Implementing _k-means clustering_ (where `k=limit`) would probably generate a more representative approximation.
2022-08-23 12:12:22 +00:00
James Dunkerley
a54a7d5553
Tidying up what is in Standard.Base (#3603)
- Added various types from the new APIs to the Standard.Base export.
- Removed Syntax_Error types for Regex and Uri and used the common one.
2022-07-27 13:28:00 +00:00
Radosław Waśko
0ea5dc2a6f
Data analysts should be able to use Text.replace to substitute parts of the text (#3393)
Implements https://www.pivotaltracker.com/story/show/181266274
2022-04-13 19:21:47 +00:00
Marcin Kostrzewa
334a022ffd
Import syntax including namespace (#1806)
2021-06-24 12:42:24 +02:00
Ara Adkins
3080d8f6f7
Add .sum to Vector (#1702)
2021-04-28 10:47:57 +01:00
Michał Wawrzyniec Urbańczyk
8d77a565eb
Case Insensitive Dataframe Support in Visualizations (#1634)
Ref https://github.com/enso-org/ide/issues/1391
2021-04-01 10:05:17 +02:00
Michał Wawrzyniec Urbańczyk
5b57960da3
Histogram and Scatterplot visualizations support for Table (#1608)
2021-03-25 17:47:22 +01:00