Add parser for line by line processing (#8719)

- Linting fixes and added `GROUP` annotations to doc comments.
- Added `File.from (that:Text)` and switched APIs to `File` conversions instead of accepting both `File` and `Text` and calling `File.new`.
- Aligned the Unix epoch with the UTC time zone and added conversion from a long value to `Date_Time` based on it.
- Added a first, simple logging API allowing log messages to be written from Enso.
- Fixed a minor style issue where a test type had an empty constructor.
- Added a `long`-based array builder.
- Added `File_By_Line` to read a file line by line.
- Added a "fast" JSON parser based on Jackson (see the usage sketch after this list).
- Altered range `to_vector` to return a proxy `Vector`.
- Added `at` and `get` to `Database.Column`.
- Added `get` to `Table.Column`.
- Added the ability to expand `Vector`, `Array`, `Range` and `Date_Range` to columns.
- Altered `expand_to_column` so the default column name matches the input column (i.e. no `Value` suffix).
- Added the ability to expand `Map`, `JS_Object` and `Jackson_Object` to rows, producing two columns (an extra key column alongside the values).
- Fixed a bug where an integer column index could not be used to expand to rows.
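A minimal usage sketch combining the two headline additions. The import line, the `logs.jsonl` path and the "error" filter are illustrative assumptions, not part of this commit:

    from Standard.Base import all

    # Hypothetical example: lazily scan a log file and parse the matching lines with
    # the new Jackson-based parser (assumes `File_By_Line` and `Text.parse_fast_json`
    # are in scope).
    example_sketch =
        lines = File_By_Line.new (File.new "logs.jsonl") . filter "error"
        lines.map line-> line.parse_fast_json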
James Dunkerley 2024-02-01 07:29:50 +00:00 committed by GitHub
parent bb8ff8f89e
commit eeaddbc434
92 changed files with 1878 additions and 199 deletions

View File

@ -609,6 +609,8 @@
join operations.][8849]
- [Attach a warning when Nothing is used as a value in a comparison or `is_in`
`Filter_Condition`.][8865]
- [Added `File_By_Line` type allowing processing a file line by line. New faster
JSON parser based on Jackson.][8719]
[debug-shortcuts]:
https://github.com/enso-org/enso/blob/develop/app/gui/docs/product/shortcuts.md#debug
@ -872,6 +874,7 @@
[8606]: https://github.com/enso-org/enso/pull/8606
[8627]: https://github.com/enso-org/enso/pull/8627
[8691]: https://github.com/enso-org/enso/pull/8691
[8719]: https://github.com/enso-org/enso/pull/8719
[8816]: https://github.com/enso-org/enso/pull/8816
[8849]: https://github.com/enso-org/enso/pull/8849
[8865]: https://github.com/enso-org/enso/pull/8865

View File

@ -2598,8 +2598,9 @@ lazy val `std-base` = project
Compile / packageBin / artifactPath :=
`base-polyglot-root` / "std-base.jar",
libraryDependencies ++= Seq(
"org.graalvm.polyglot" % "polyglot" % graalMavenPackagesVersion,
"org.netbeans.api" % "org-openide-util-lookup" % netbeansApiVersion % "provided"
"org.graalvm.polyglot" % "polyglot" % graalMavenPackagesVersion,
"org.netbeans.api" % "org-openide-util-lookup" % netbeansApiVersion % "provided",
"com.fasterxml.jackson.core" % "jackson-databind" % jacksonVersion
),
Compile / packageBin := Def.task {
val result = (Compile / packageBin).value

View File

@ -1,6 +1,21 @@
Enso
Copyright 2020 - 2024 New Byte Order sp. z o. o.
'jackson-annotations', licensed under the The Apache Software License, Version 2.0, is distributed with the Base.
The license file can be found at `licenses/APACHE2.0`.
Copyright notices related to this dependency can be found in the directory `com.fasterxml.jackson.core.jackson-annotations-2.15.2`.
'jackson-core', licensed under the The Apache Software License, Version 2.0, is distributed with the Base.
The license file can be found at `licenses/APACHE2.0`.
Copyright notices related to this dependency can be found in the directory `com.fasterxml.jackson.core.jackson-core-2.15.2`.
'jackson-databind', licensed under the The Apache Software License, Version 2.0, is distributed with the Base.
The license file can be found at `licenses/APACHE2.0`.
Copyright notices related to this dependency can be found in the directory `com.fasterxml.jackson.core.jackson-databind-2.15.2`.
'icu4j', licensed under the Unicode/ICU License, is distributed with the Base.
The license information can be found along with the copyright notices.
Copyright notices related to this dependency can be found in the directory `com.ibm.icu.icu4j-73.1`.

View File

@ -0,0 +1,21 @@
# Jackson JSON processor
Jackson is a high-performance, Free/Open Source JSON processing library.
It was originally written by Tatu Saloranta (tatu.saloranta@iki.fi), and has
been in development since 2007.
It is currently developed by a community of developers.
## Copyright
Copyright 2007-, Tatu Saloranta (tatu.saloranta@iki.fi)
## Licensing
Jackson 2.x core and extension components are licensed under Apache License 2.0
To find the details that apply to this artifact see the accompanying LICENSE file.
## Credits
A list of contributors may be found from CREDITS(-2.x) file, which is included
in some artifacts (usually source distributions); but is always available
from the source code management (SCM) system project uses.

View File

@ -0,0 +1,26 @@
/*
* Copyright 2018-2020 Raffaello Giulietti
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
* AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
/* Jackson JSON-processor.
*
* Copyright (c) 2007- Tatu Saloranta, tatu.saloranta@iki.fi
*/

View File

@ -0,0 +1,32 @@
# Jackson JSON processor
Jackson is a high-performance, Free/Open Source JSON processing library.
It was originally written by Tatu Saloranta (tatu.saloranta@iki.fi), and has
been in development since 2007.
It is currently developed by a community of developers.
## Copyright
Copyright 2007-, Tatu Saloranta (tatu.saloranta@iki.fi)
## Licensing
Jackson 2.x core and extension components are licensed under Apache License 2.0
To find the details that apply to this artifact see the accompanying LICENSE file.
## Credits
A list of contributors may be found from CREDITS(-2.x) file, which is included
in some artifacts (usually source distributions); but is always available
from the source code management (SCM) system project uses.
## FastDoubleParser
jackson-core bundles a shaded copy of FastDoubleParser <https://github.com/wrandelshofer/FastDoubleParser>.
That code is available under an MIT license <https://github.com/wrandelshofer/FastDoubleParser/blob/main/LICENSE>
under the following copyright.
Copyright © 2023 Werner Randelshofer, Switzerland. MIT License.
See FastDoubleParser-NOTICE for details of other source code included in FastDoubleParser
and the licenses and copyrights that apply to that code.

View File

@ -0,0 +1,21 @@
# Jackson JSON processor
Jackson is a high-performance, Free/Open Source JSON processing library.
It was originally written by Tatu Saloranta (tatu.saloranta@iki.fi), and has
been in development since 2007.
It is currently developed by a community of developers.
## Copyright
Copyright 2007-, Tatu Saloranta (tatu.saloranta@iki.fi)
## Licensing
Jackson 2.x core and extension components are licensed under Apache License 2.0
To find the details that apply to this artifact see the accompanying LICENSE file.
## Credits
A list of contributors may be found from CREDITS(-2.x) file, which is included
in some artifacts (usually source distributions); but is always available
from the source code management (SCM) system project uses.

View File

@ -0,0 +1,31 @@
/*
* Copyright 2010 Google Inc. All Rights Reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
/*
* Copyright 2011 Google Inc. All Rights Reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

View File

@ -0,0 +1,201 @@
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

View File

@ -309,7 +309,8 @@ type Any
if_nothing self ~other =
const self other
## If `self` is Nothing then returns Nothing, otherwise returns the result
## GROUP Logical
If `self` is Nothing then returns Nothing, otherwise returns the result
of running the provided `action`.
> Example

View File

@ -145,9 +145,9 @@ read_text path (encoding=Encoding.utf_8) (on_problems=Problem_Behavior.Report_Wa
example_list_files =
Data.list_directory Examples.data_dir name_filter="**.md" recursive=True
list_directory : (File | Text) -> Text -> Boolean -> Vector File
list_directory directory name_filter=Nothing recursive=False =
File.new directory . list name_filter=name_filter recursive=recursive
list_directory : File -> Text | Nothing -> Boolean -> Vector File
list_directory directory:File name_filter:Text|Nothing=Nothing recursive:Boolean=False =
directory . list name_filter=name_filter recursive=recursive
## ALIAS download, http get
GROUP Input

View File

@ -0,0 +1,208 @@
import project.Any.Any
import project.Data.Array_Proxy.Array_Proxy
import project.Data.Json.Invalid_JSON
import project.Data.Json.JS_Object
import project.Data.Numbers.Integer
import project.Data.Numbers.Number
import project.Data.Pair.Pair
import project.Data.Text.Text
import project.Data.Vector.Vector
import project.Error.Error
import project.Errors.No_Such_Key.No_Such_Key
import project.Metadata.Display
import project.Metadata.Widget
import project.Nothing.Nothing
import project.Panic.Panic
from project.Data.Boolean import Boolean, False, True
from project.Data.Json.Extensions import all
from project.Data.Ordering import all
from project.Data.Text.Extensions import all
from project.Metadata.Choice import Option
from project.Metadata.Widget import Single_Choice
polyglot java import com.fasterxml.jackson.core.JsonProcessingException
polyglot java import com.fasterxml.jackson.databind.JsonNode
polyglot java import com.fasterxml.jackson.databind.ObjectMapper
polyglot java import com.fasterxml.jackson.databind.node.ArrayNode
polyglot java import com.fasterxml.jackson.databind.node.JsonNodeType
polyglot java import com.fasterxml.jackson.databind.node.ObjectNode
## PRIVATE
Jackson-based JSON Parser
type Java_Json
parse : Text -> Nothing | Boolean | Number | Text | Vector | Jackson_Object
parse text:Text =
error_handler js_exception =
Error.throw (Invalid_JSON.Error js_exception.payload.message)
Panic.catch JsonProcessingException handler=error_handler <|
node = ObjectMapper.new.readTree text
read_json_node node
## PRIVATE
Read a JsonNode to an Enso type
read_json_node : JsonNode -> Nothing | Boolean | Number | Text | Vector | Jackson_Object
read_json_node node = case node.getNodeType of
JsonNodeType.NULL -> Nothing
JsonNodeType.BOOLEAN -> node.asBoolean
JsonNodeType.STRING -> node.asText
JsonNodeType.NUMBER ->
if node.isFloatingPointNumber then node.asDouble else node.asLong
JsonNodeType.ARRAY -> read_json_array node
JsonNodeType.OBJECT -> Jackson_Object.new node
## PRIVATE
Read a JsonNode to a Vector
read_json_array : ArrayNode -> Vector
read_json_array node =
proxy = Array_Proxy.new node.size i-> (read_json_node (node.get i))
Vector.from_polyglot_array proxy
## PRIVATE
type Jackson_Object
new : ObjectNode -> Jackson_Object
new object_node =
make_field_names object =
name_iterator = object.fieldNames
builder = Vector.new_builder object.size
loop iterator builder =
if iterator.hasNext.not then builder.to_vector else
builder.append iterator.next
@Tail_Call loop iterator builder
loop name_iterator builder
Jackson_Object.Value object_node (make_field_names object_node)
## PRIVATE
Value object_node ~field_array
## GROUP Logical
Returns True iff the object contains the given `key`.
contains_key : Text -> Boolean
contains_key self key:Text = self.object_node.has key
## Get a value for a key of the object, or a default value if that key is not present.
Arguments:
- key: The key to get.
- if_missing: The value to return if the key is not found.
@key make_field_name_selector
get : Text -> Any -> Nothing | Boolean | Number | Text | Vector | Jackson_Object
get self key:Text ~if_missing=Nothing =
if self.contains_key key . not then if_missing else
child = self.object_node.get key
read_json_node child
## GROUP Selections
Get a value for a key of the object.
If the key is not found, throws a `No_Such_Key` error.
Arguments:
- key: The key to get.
@key make_field_name_selector
at : Text -> Jackson_Object | Boolean | Number | Nothing | Text | Vector ! No_Such_Key
at self key:Text = self.get key (Error.throw (No_Such_Key.Error self key))
## GROUP Metadata
Get the keys of the object.
field_names : Vector
field_names self = self.field_array
## Maps a function over each value in this object
Arguments:
- function: The function to apply to each value in the map, taking a
value and returning a value.
map : (Any->Any) -> Vector
map self function =
kv_func = _ -> function
self.map_with_key kv_func
## Maps a function over each field-value pair in the object.
Arguments:
- function: The function to apply to each key and value in the map,
taking a key and a value and returning a value.
map_with_key : (Any -> Any -> Any) -> Vector
map_with_key self function =
self.field_names.map key->
value = self.get key
function key value
## GROUP Conversions
Convert the object to a Vector of Pairs.
to_vector : Vector
to_vector self =
keys = self.field_array
proxy = Array_Proxy.new keys.length (i-> [(keys.at i), (self.get (keys.at i))])
Vector.from_polyglot_array proxy
## GROUP Metadata
Gets the number of keys in the object.
length : Number
length self = self.object_node.size
## GROUP Logical
Returns True iff the Map is empty, i.e., does not have any entries.
is_empty : Boolean
is_empty self = self.length == 0
## GROUP Logical
Returns True iff the Map is not empty, i.e., has at least one entry.
not_empty : Boolean
not_empty self = self.is_empty.not
## PRIVATE
Convert the object to a JS_Object.
to_js_object : JS_Object
to_js_object self =
pairs = self.field_names.map name-> [name, self.at name . to_js_object]
JS_Object.from_pairs pairs
## PRIVATE
Convert to a Text.
to_text : Text
to_text self = self.to_json
## PRIVATE
Convert the Jackson_Object to a friendly string.
to_display_text : Text
to_display_text self =
self.to_text.to_display_text
## PRIVATE
Convert to a JSON representation.
to_json : Text
to_json self = self.object_node.toString
## PRIVATE
Make a field name selector
make_field_name_selector : Jackson_Object -> Display -> Widget
make_field_name_selector js_object display=Display.Always =
Single_Choice display=display values=(js_object.field_names.map n->(Option n n.pretty))
## Extension method allowing the fast parser to be called directly on Text.
Text.parse_fast_json : Nothing | Boolean | Number | Text | Vector | Jackson_Object
Text.parse_fast_json self = Java_Json.parse self
## PRIVATE
type Jackson_Object_Comparator
## PRIVATE
compare : Jackson_Object -> Jackson_Object -> (Ordering|Nothing)
compare obj1 obj2 =
obj1_keys = obj1.field_names
obj2_keys = obj2.field_names
same_values = obj1_keys.length == obj2_keys.length && obj1_keys.all key->
(obj1.get key == obj2.at key).catch No_Such_Key _->False
if same_values then Ordering.Equal else Nothing
## PRIVATE
hash : Jackson_Object -> Integer
hash obj =
values_hashes = obj.field_names.map field_name->
val = obj.get field_name
Comparable.from val . hash val
# Return sum, as we don't care about ordering of field names
values_hashes.fold 0 (+)
## PRIVATE
Comparable.from (_:Jackson_Object) = Jackson_Object_Comparator
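A short, illustrative usage sketch for the parser above; the JSON literal is made up and the sketch assumes this module is imported:

    example_parse =
        obj = Java_Json.parse '{"name": "Enso", "tags": ["fast", "json"]}'
        # Objects are returned as Jackson_Object, arrays as Vector, scalars as plain Enso values.
        [(obj.at "name"), (obj.get "missing" if_missing="n/a"), obj.field_names]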

View File

@ -168,18 +168,49 @@ type JS_Object
field_names self =
Vector.from_polyglot_array (get_property_names self.js_object)
## Maps a function over each value in this object
Arguments:
- function: The function to apply to each value in the map, taking a
value and returning a value.
map : (Any->Any) -> Vector
map self function =
kv_func = _ -> function
self.map_with_key kv_func
## Maps a function over each field-value pair in the object.
Arguments:
- function: The function to apply to each key and value in the map,
taking a key and a value and returning a value.
map_with_key : (Any -> Any -> Any) -> Vector
map_with_key self function =
self.field_names.map key->
value = self.get key
function key value
## GROUP Metadata
Gets the number of keys in the object.
length : Number
length self =
get_property_names self.js_object . length
## GROUP Logical
Returns True iff the Map is empty, i.e., does not have any entries.
is_empty : Boolean
is_empty self = self.length == 0
## GROUP Logical
Returns True iff the Map is not empty, i.e., has at least one entry.
not_empty : Boolean
not_empty self = self.is_empty.not
## GROUP Conversions
Convert the object to a Vector of Pairs.
Convert the object to a Vector of keys and values, each entry being a two-element Vector.
to_vector : Vector
to_vector self =
keys = get_property_names self.js_object
proxy = Array_Proxy.new keys.length (i-> Pair.new (keys.at i) (self.get (keys.at i)))
proxy = Array_Proxy.new keys.length (i-> [(keys.at i), (self.get (keys.at i))])
Vector.from_polyglot_array proxy
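An illustrative sketch of the `JS_Object` helpers added above (the JSON literal is made up):

    example_js_object =
        obj = '{"a": 1, "b": 2}'.parse_json
        pairs = obj.map_with_key key-> value-> key + "=" + value.to_text
        # `to_vector` now yields two-element vectors such as ["a", 1] instead of `Pair` values.
        [obj.length, obj.is_empty, pairs, obj.to_vector]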
## PRIVATE

View File

@ -1,4 +1,5 @@
import project.Any.Any
import project.Data.Array_Proxy.Array_Proxy
import project.Data.Filter_Condition.Filter_Condition
import project.Data.Numbers.Integer
import project.Data.Pair.Pair
@ -502,7 +503,10 @@ type Range
1.up_to 6 . to_vector
to_vector : Vector Integer
to_vector self = self.map x->x
to_vector self =
proxy = Array_Proxy.new self.length self.at
Vector.from_polyglot_array proxy
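The vector is now backed by an `Array_Proxy`, so elements are produced on demand instead of being materialized eagerly; a small illustrative sketch:

    example_proxy_vector =
        vec = 0.up_to 1000000 . to_vector
        # The elements are computed lazily by the proxy when accessed.
        vec.at 123456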
## Combines all the elements of a non-empty range using a binary operation.
If the range is empty, returns `if_empty`.

View File

@ -1,4 +1,5 @@
import project.Any.Any
import project.Data.Array_Proxy.Array_Proxy
import project.Data.Filter_Condition.Filter_Condition
import project.Data.Json.JS_Object
import project.Data.Numbers.Integer
@ -202,7 +203,9 @@ type Date_Range
(Date.new 2021 05 07).up_to (Date.new 2021 05 10) . to_vector
to_vector : Vector Date
to_vector self = self.map x->x
to_vector self =
proxy = Array_Proxy.new self.length self.at
Vector.from_polyglot_array proxy
## GROUP Logical
Checks if this range is empty.

View File

@ -36,7 +36,7 @@ polyglot java import org.enso.base.Time_Utils
## PRIVATE
unix_epoch_start : Date_Time
unix_epoch_start = Date_Time.new 1970
unix_epoch_start = Date_Time.new 1970 zone=Time_Zone.utc
## PRIVATE
ensure_in_epoch : (Date_Time | Date) -> (Any -> Any) -> Any
@ -281,6 +281,37 @@ type Date_Time
parse text:Text format:Date_Time_Formatter=Date_Time_Formatter.default_enso_zoned_date_time =
format.parse_date_time text
## Creates a new `Date_Time` from a Unix epoch timestamp in seconds (and optional nanoseconds).
Arguments:
- seconds: The number of seconds since the Unix epoch.
- nanoseconds: The number of nanoseconds within the second.
> Example
Create a new `Date_Time` from a Unix epoch timestamp.
from Standard.Base import Date_Time
example_from_unix_epoch = Date_Time.from_unix_epoch_seconds 1601587200
from_unix_epoch_seconds : Integer -> Integer -> Date_Time
from_unix_epoch_seconds seconds:Integer nanoseconds:Integer=0 =
unix_epoch_start + Duration.new seconds=seconds nanoseconds=nanoseconds
## Creates a new `Date_Time` from a Unix epoch timestamp in milliseconds.
Arguments:
- milliseconds: The number of milliseconds since the Unix epoch.
> Example
Create a new `Date_Time` from a Unix epoch timestamp.
from Standard.Base import Date_Time
example_from_unix_epoch = Date_Time.from_unix_epoch_milliseconds 1601587200000
from_unix_epoch_milliseconds : Integer -> Date_Time
from_unix_epoch_milliseconds milliseconds:Integer =
unix_epoch_start + Duration.new milliseconds=milliseconds
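An illustrative sketch of the epoch helpers above, showing that the results are anchored to UTC (the timestamp values are arbitrary):

    example_epoch =
        dt = Date_Time.from_unix_epoch_seconds 1601587200
        ms = Date_Time.from_unix_epoch_milliseconds 1601587200000
        # Both are built from `unix_epoch_start`, which is now pinned to `Time_Zone.utc`.
        [dt.zone, ms.zone]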
## GROUP Metadata
Get the year portion of the time.

View File

@ -0,0 +1,39 @@
import project.Any.Any
import project.Data.Text.Text
import project.Meta
import project.Nothing.Nothing
polyglot java import java.util.logging.Logger
## PRIVATE
Log a message.
This function needs to be enabled by importing `Standard.Base.Logging` using
`from Standard.Base.Logging import all`.
Any.log_message : Text -> Log_Level -> Any
Any.log_message self ~message:Text level:Log_Level=Log_Level.Info =
type_name = Meta.get_qualified_type_name self
logger = Logger.getLogger type_name
case level of
Log_Level.Finest -> logger.finest message
Log_Level.Fine -> logger.fine message
Log_Level.Info -> logger.info message
Log_Level.Warning -> logger.warning message
Log_Level.Severe -> logger.severe message
self
## PRIVATE
type Log_Level
## Finest (Trace) level log message.
Finest
## Fine (Debug) level log message.
Fine
## Info level log message.
Info
## Warning level log message.
Warning
## Severe level log message.
Severe
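A possible usage sketch for the logging API above; the receiver and message are illustrative:

    from Standard.Base import all
    from Standard.Base.Logging import all

    example_logging =
        result = [1, 2, 3].log_message "Built the sample vector" level=Log_Level.Fine
        # `log_message` returns `self`, so it can sit in the middle of a pipeline.
        result.length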

View File

@ -191,11 +191,10 @@ type Response_Body
example_to_file =
Examples.get_geo_data.to_file Examples.scratch_file
to_file : File | Text -> Existing_File_Behavior -> File
to_file self file on_existing_file=Existing_File_Behavior.Backup =
to_file : File -> Existing_File_Behavior -> File
to_file self file:File on_existing_file=Existing_File_Behavior.Backup =
self.with_stream body_stream->
f = File.new file
r = on_existing_file.write f output_stream->
r = on_existing_file.write file output_stream->
output_stream.write_stream body_stream
r.if_not_error file

View File

@ -23,11 +23,10 @@ from project.Data.Text.Extensions import all
polyglot java import java.net.URI as Java_URI
polyglot java import java.net.URISyntaxException
polyglot java import org.graalvm.collections.Pair as Java_Pair
polyglot java import org.enso.base.enso_cloud.EnsoSecretAccessDenied
polyglot java import org.enso.base.net.URITransformer
polyglot java import org.enso.base.net.URIWithSecrets
polyglot java import org.graalvm.collections.Pair as Java_Pair
## Represents a Uniform Resource Identifier (URI) reference.
type URI

View File

@ -31,7 +31,8 @@ type Nothing
if_nothing : Any -> Any
if_nothing self ~function = function
## If `self` is Nothing then returns Nothing, otherwise returns the result
## GROUP Logical
If `self` is Nothing then returns Nothing, otherwise returns the result
of running the provided `action`.
> Example

View File

@ -821,3 +821,7 @@ find_extension_from_name : Text -> Text
find_extension_from_name name =
extension = name.drop (Text_Sub_Range.Before_Last ".")
if extension == "." then "" else extension
## PRIVATE
Convert from a Text to a File.
File.from (that:Text) = File.new that
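With this conversion in place, any argument ascribed `:File` also accepts a `Text` path; a sketch (the path and filter are illustrative):

    example_conversion =
        # "/tmp/data" is converted through `File.from (that:Text)` because
        # `Data.list_directory` now declares its argument as `directory:File`.
        Data.list_directory "/tmp/data" name_filter="**.csv" recursive=True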

View File

@ -50,15 +50,14 @@ polyglot java import org.enso.base.Array_Utils
This allows for building the workflow without affecting the real files.
@encoding Encoding.default_widget
Text.write : (File|Text) -> Encoding -> Existing_File_Behavior -> Problem_Behavior -> File ! Encoding_Error | Illegal_Argument | File_Error
Text.write self path encoding=Encoding.utf_8 on_existing_file=Existing_File_Behavior.Backup on_problems=Problem_Behavior.Report_Warning =
Text.write : File -> Encoding -> Existing_File_Behavior -> Problem_Behavior -> File ! Encoding_Error | Illegal_Argument | File_Error
Text.write self path:File encoding=Encoding.utf_8 on_existing_file=Existing_File_Behavior.Backup on_problems=Problem_Behavior.Report_Warning =
bytes = self.bytes encoding on_problems
bytes.if_not_error <|
actual = File.new path
effective_existing_behaviour = on_existing_file.get_effective_behavior actual
file = if Context.Output.is_enabled then actual else
effective_existing_behaviour = on_existing_file.get_effective_behavior path
file = if Context.Output.is_enabled then path else
should_copy_file = on_existing_file==Existing_File_Behavior.Append
actual.create_dry_run_file copy_original=should_copy_file
path.create_dry_run_file copy_original=should_copy_file
Context.Output.with_enabled <|
r = effective_existing_behaviour.write file stream->
@ -97,13 +96,12 @@ Text.write self path encoding=Encoding.utf_8 on_existing_file=Existing_File_Beha
import Standard.Examples
[36, -62, -93, -62, -89, -30, -126, -84, -62, -94].write_bytes Examples.scratch_file
[36, -62, -93, -62, -89, -30, -126, -84, -62, -94].write_bytes Examples.scratch_file Existing_File_Behavior.Append
Vector.write_bytes : (File|Text) -> Existing_File_Behavior -> File ! Illegal_Argument | File_Error
Vector.write_bytes self path on_existing_file=Existing_File_Behavior.Backup =
Vector.write_bytes : File -> Existing_File_Behavior -> File ! Illegal_Argument | File_Error
Vector.write_bytes self path:File on_existing_file=Existing_File_Behavior.Backup =
Panic.catch Unsupported_Argument_Types handler=(_ -> Error.throw (Illegal_Argument.Error "Only Vectors consisting of bytes (integers in the range from -128 to 127) are supported by the `write_bytes` method.")) <|
## Convert to a byte array before writing - and fail early if there is any problem.
byte_array = Array_Utils.ensureByteArray self
file = File.new path
r = on_existing_file.write file stream->
r = on_existing_file.write path stream->
stream.write_bytes (Vector.from_polyglot_array byte_array)
r.if_not_error file
r.if_not_error path

View File

@ -0,0 +1,222 @@
import project.Any.Any
import project.Data.Array_Proxy.Array_Proxy
import project.Data.Numbers.Integer
import project.Data.Text.Encoding.Encoding
import project.Data.Text.Text
import project.Data.Vector.Vector
import project.Error.Error
import project.Errors.Common.Index_Out_Of_Bounds
import project.Errors.File_Error.File_Error
import project.Nothing.Nothing
import project.System.File.File
from project.Data.Boolean import Boolean, False, True
from project.Data.Range.Extensions import all
from project.Data.Text.Extensions import all
from project.Logging import all
polyglot java import org.enso.base.Array_Utils
polyglot java import org.enso.base.arrays.LongArrayList
polyglot java import org.enso.base.FileLineReader
polyglot java import java.io.File as Java_File
polyglot java import java.nio.charset.Charset
type File_By_Line
## Creates a new File_By_Line object.
Arguments
- file: The file to read.
- encoding: The encoding to use when reading the file (defaults to UTF 8).
- offset: The position within the file to read from (defaults to first byte).
new : File -> Encoding -> Integer -> File_By_Line
new file:File encoding:Encoding=Encoding.utf_8 offset:Integer=0 =
create_row_map =
row_map = LongArrayList.new
row_map.add offset
File_By_Line.log_message "Created row map"
row_map
File_By_Line.Reader file encoding Nothing Nothing create_row_map
## PRIVATE
Creates a new File_By_Line object.
Arguments
- file: The file to read.
- encoding: The encoding to use when reading the file (defaults to UTF 8).
- limit_lines: The number of lines to read (defaults to all lines).
- filter_func: The filter to apply to each line (defaults to no filter).
- row_map: The row map to use (defaults to a new row map).
- file_end: The end of the file in bytes.
Reader file:File encoding:Encoding limit_lines:(Integer|Nothing) filter_func row_map file_end=file.size
## Reads a specific line from the file.
Arguments
- line: The line to read (0 indexed).
get : Integer->Text
get self line:Integer = if self.limit_lines.is_nothing.not && line>self.limit_lines then Error.throw (Index_Out_Of_Bounds.Error line self.limit_lines) else
read_line self line
## Reads the first line
first : Text
first self = self.get 0
## Reads the second line
second : Text
second self = self.get 1
## Counts the number of lines in the file.
count : Integer
count self =
end_at = if self.limit_lines.is_nothing then -1 else self.limit_lines
for_each_lines self 0 end_at Nothing
## We've added all the indexes to the row map, including the last one, so we need to subtract 1.
As the row_map can be shared, return the limit if one is set.
if end_at == -1 then self.row_map.getSize-1 else end_at.min self.row_map.getSize-1
## Returns the lines in the file as a vector.
to_vector : Vector Text
to_vector self = File_Error.handle_java_exceptions self.file <|
end_at = if self.limit_lines.is_nothing then Nothing else self.limit_lines-1
FileLineReader.readLines self.java_file self.file_end self.row_map 0 end_at self.charset self.filter_func
## Performs an action on each line.
Arguments
- function: The action to perform on each line.
each : (Text -> Any) -> Nothing
each self function =
new_function _ t = function t
self.each_with_index new_function
## Performs an action on each line.
Arguments:
- function: A function to apply that takes an index and an item.
The function is called with both the element index as well as the
element itself.
each_with_index : (Integer -> Text -> Any) -> Nothing
each_with_index self function =
end_at = if self.limit_lines.is_nothing then Nothing else self.limit_lines-1
for_each_lines self 0 end_at function
## Transforms each line in the file and returns the result as a vector.
Arguments
- action: The action to perform on each line.
map : (Text -> Any) -> Vector Any
map self action =
builder = Vector.new_builder
wrapped_action _ t = builder.append (action t)
self.each_with_index wrapped_action
builder.to_vector
## Transforms each line in the file and returns the result as a vector.
Arguments
- action: The action to perform on each line.
map_with_index : (Integer -> Text -> Any) -> Vector Any
map_with_index self action =
builder = Vector.new_builder
wrapped_action i t = builder.append (action i t)
self.each_with_index wrapped_action
builder.to_vector
## Skips the specified number of lines.
Arguments
- lines: The number of lines to skip.
skip : Integer -> File_By_Line
skip self lines:Integer =
## Build the row map for the skipped reader, reading up to the skipped line if needed.
create_row_map parent line =
if parent.row_map.getSize <= line then for_each_lines self 0 line Nothing
position = parent.row_map.getOrLast line
row_map = LongArrayList.new
row_map.add position
parent.log_message "Created Skipped Row Map ("+line.to_text+")"
row_map
new_limit = if self.limit_lines.is_nothing then lines else lines.min self.limit_lines
File_By_Line.Reader self.file self.encoding new_limit self.filter_func (create_row_map self lines) self.file_end
## Limits a file to a specific number of lines.
Arguments
- lines: The number of lines to limit the file to.
limit : Integer -> File_By_Line
limit self lines:Integer =
File_By_Line.Reader self.file self.encoding lines self.filter_func self.row_map self.file_end
## Filters the file by a predicate.
Arguments
- predicate: The predicate to filter by.
filter : Text | (Text -> Boolean) -> File_By_Line
filter self predicate =
## Create the predicate
new_filter = case predicate of
_:Text -> FileLineReader.createContainsFilter predicate self.charset
_ -> FileLineReader.wrapBooleanFilter predicate self.charset
## Find the position of the first line matching the new filter.
make_filter_map parent new_filter =
end_at = if parent.limit_lines.is_nothing then -1 else parent.limit_lines-1
first_index = FileLineReader.findFirstNewFilter parent.java_file parent.file_end parent.row_map end_at parent.charset parent.filter_func new_filter
new_row_map = LongArrayList.new
new_row_map.add first_index
parent.log_message "Found Filter Start - "+first_index.to_text
new_row_map
## Merge the two predicates together.
new_predicate = if self.filter_func.is_nothing then new_filter else
FileLineReader.mergeTwoFilters self.filter_func new_filter
## If the parent is limited need to limit the child by end position in file.
if self.limit_lines == Nothing then File_By_Line.Reader self.file self.encoding Nothing new_predicate (make_filter_map self new_filter) self.file_end else
## Find the file position of the parent's limit line, to bound the child reader.
index_of parent line =
file_len = if parent.row_map.getSize > line then parent.row_map.get line else
for_each_lines self 0 line Nothing
parent.row_map.get parent.row_map.getSize-1
parent.log_message "Created File End ("+line.to_text+") - "+file_len.to_text
file_len
File_By_Line.Reader self.file self.encoding Nothing new_predicate (make_filter_map self new_filter) (index_of self self.limit_lines)
## ADVANCED
Exports the row_map
row_positions : Vector Integer
row_positions self = Vector.from_polyglot_array <|
Array_Proxy.new self.row_map.getSize (i-> self.row_map.get i)
## PRIVATE
Gets the Java_File for the backing file.
java_file : Java_File
java_file self = Java_File.new self.file.path
## PRIVATE
Gets the encoding as a Java Charset.
charset : Charset
charset self = self.encoding.to_java_charset
## PRIVATE
Reads a specific line from the file.
read_line : File_By_Line->Integer->Any->Any
read_line file:File_By_Line line:Integer=0 ~default=Nothing = File_Error.handle_java_exceptions file.file <|
FileLineReader.readSingleLine file.java_file file.file_end file.row_map line file.charset file.filter_func . if_nothing default
## PRIVATE
Performs an action on each line in the file.
for_each_lines : File_By_Line->Integer->(Integer|Nothing)->Any->Any
for_each_lines file:File_By_Line start_at:Integer end_at:(Integer|Nothing) action = File_Error.handle_java_exceptions file.file <|
java_file = file.java_file
row_map = file.row_map
file_end = file.file_end
charset = file.charset
## First, if we haven't yet read as far as the start_at line, find it.
if start_at >= row_map.getSize then FileLineReader.readSingleLine java_file file_end row_map start_at charset file.filter_func
## Now we can read the lines we need.
if row_map.getOrLast start_at >= file_end then Error.throw (Index_Out_Of_Bounds.Error start_at row_map.getSize) else
FileLineReader.forEachLine java_file file_end row_map start_at (end_at.if_nothing -1) charset file.filter_func action
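An illustrative end-to-end sketch of the reader above; the file path is made up and the exact import of `File_By_Line` is assumed:

    from Standard.Base import all

    example_by_line =
        reader = File_By_Line.new (File.new "big_log.txt")
        errors = reader.filter "ERROR" . limit 100
        # Only the bytes needed for the requested lines are read.
        [reader.first, errors.count, errors.to_vector]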

View File

@ -7,7 +7,7 @@ type Client_Certificate
- cert_file: path to the client certificate file.
- key_file: path to the client key file.
- key_password: password for the client key file.
Value cert_file:(File|Text) key_file:(File|Text) (key_password:Text='')
Value cert_file:File key_file:File (key_password:Text='')
## PRIVATE
Creates the JDBC properties for the client certificate.
@ -18,5 +18,5 @@ type Client_Certificate
- sslpass: password for the client key file.
properties : Vector
properties self =
base = [Pair.new 'sslcert' (File.new self.cert_file).absolute.path, Pair.new 'sslkey' (File.new self.key_file).absolute.path]
base = [Pair.new 'sslcert' self.cert_file.absolute.path, Pair.new 'sslkey' self.key_file.absolute.path]
if self.key_password == "" then base else base + [Pair.new 'sslpassword' self.key_password]

View File

@ -8,7 +8,7 @@ type SQLite_Details
Arguments:
- location: Location of the SQLite database to connect to.
SQLite (location:(In_Memory|File|Text))
SQLite (location:(In_Memory|File))
## PRIVATE
Build the Connection resource.
@ -25,7 +25,7 @@ type SQLite_Details
jdbc_url : Text
jdbc_url self = case self.location of
In_Memory -> "jdbc:sqlite::memory:"
_ -> "jdbc:sqlite:" + ((File.new self.location).absolute.path.replace '\\' '/')
_ -> "jdbc:sqlite:" + (self.location.absolute.path.replace '\\' '/')
## PRIVATE
Provides the properties for the connection.

View File

@ -12,8 +12,8 @@ type SSL_Mode
## Will use SSL, validating the certificate but not verifying the hostname.
If `ca_file` is `Nothing`, the default CA certificate store will be used.
Verify_CA ca_file:Nothing|File|Text=Nothing
Verify_CA ca_file:Nothing|File=Nothing
## Will use SSL, validating the certificate and checking the hostname matches.
If `ca_file` is `Nothing`, the default CA certificate store will be used.
Full_Verification ca_file:Nothing|File|Text=Nothing
Full_Verification ca_file:Nothing|File=Nothing

View File

@ -1,4 +1,5 @@
from Standard.Base import all
import Standard.Base.Errors.Common.Index_Out_Of_Bounds
import Standard.Base.Errors.Illegal_Argument.Illegal_Argument
import Standard.Base.Errors.Illegal_State.Illegal_State
import Standard.Base.Internal.Rounding_Helpers
@ -112,6 +113,44 @@ type Column
to_vector : Vector Any
to_vector self = self.read max_rows=Nothing . to_vector
## GROUP Standard.Base.Selections
ICON select_row
Returns the value contained in this column at the given index.
Arguments:
- index: The index in the column from which to get the value.
If the value is an NA then this method returns Nothing. If the index is
not an index in the column it returns an `Index_Out_Of_Bounds`.
> Example
Get the first element from a column.
import Standard.Examples
example_at = Examples.integer_column.at 0
at : Integer -> (Any | Nothing) ! Index_Out_Of_Bounds
at self (index : Integer) =
self.get index (Error.throw (Index_Out_Of_Bounds.Error index self.length))
## GROUP Standard.Base.Selections
ICON select_row
Returns the value contained in this column at the given index.
Arguments:
- index: The index in the column from which to get the value.
- default: The value if the index is out of range.
> Example
Get the first element from a column.
import Standard.Examples
example_at = Examples.integer_column.get 0 -1
get : Integer -> (Any | Nothing)
get self (index : Integer) (~default=Nothing) =
self.read index+1 . get index default
## GROUP Standard.Base.Metadata
Returns the `Value_Type` associated with that column.

View File

@ -519,7 +519,7 @@ type Table
if Helpers.check_integrity self column then column else
Panic.throw (Integrity_Error.Error "Column "+column.name)
## ALIAS filter rows
## ALIAS filter rows, where
GROUP Standard.Base.Selections
ICON preparation
@ -2516,8 +2516,8 @@ type Table
table = connection.query (SQL_Query.Table_Name "Table")
table.write (enso_project.data / "example_csv_output.csv")
@format Widget_Helpers.write_table_selector
write : File|Text -> File_Format -> Existing_File_Behavior -> Match_Columns -> Problem_Behavior -> Nothing ! Column_Count_Mismatch | Illegal_Argument | File_Error
write self path format=Auto_Detect on_existing_file=Existing_File_Behavior.Backup match_columns=Match_Columns.By_Name on_problems=Report_Warning =
write : File -> File_Format -> Existing_File_Behavior -> Match_Columns -> Problem_Behavior -> Nothing ! Column_Count_Mismatch | Illegal_Argument | File_Error
write self path:File format=Auto_Detect on_existing_file=Existing_File_Behavior.Backup match_columns=Match_Columns.By_Name on_problems=Report_Warning =
# TODO This should ideally be done in a streaming manner, or at least respect the row limits.
self.read.write path format on_existing_file match_columns on_problems

View File

@ -24,9 +24,9 @@ polyglot java import java.sql.DatabaseMetaData
polyglot java import java.sql.PreparedStatement
polyglot java import java.sql.SQLException
polyglot java import java.sql.SQLTimeoutException
polyglot java import org.graalvm.collections.Pair as Java_Pair
polyglot java import org.enso.database.dryrun.OperationSynchronizer
polyglot java import org.enso.database.JDBCProxy
polyglot java import org.graalvm.collections.Pair as Java_Pair
type JDBC_Connection
## PRIVATE

View File

@ -2056,8 +2056,26 @@ type Column
example_at = Examples.integer_column.at 0
at : Integer -> (Any | Nothing) ! Index_Out_Of_Bounds
at self (index : Integer) =
self.get index (Error.throw (Index_Out_Of_Bounds.Error index self.length))
## GROUP Standard.Base.Selections
ICON select_row
Returns the value contained in this column at the given index.
Arguments:
- index: The index in the column from which to get the value.
- default: The value if the index is out of range.
> Example
Get the first element from a column.
import Standard.Examples
example_at = Examples.integer_column.get 0 -1
get : Integer -> (Any | Nothing)
get self (index : Integer) (~default=Nothing) =
valid_index = (index >= 0) && (index < self.length)
if valid_index.not then Error.throw (Index_Out_Of_Bounds.Error index self.length) else
if valid_index.not then default else
storage = self.java_column.getStorage
if storage.isNa index then Nothing else
storage.getItem index

View File

@ -1,5 +1,6 @@
from Standard.Base import all
import Standard.Base.Errors.Illegal_Argument.Illegal_Argument
import Standard.Base.Data.Java_Json.Jackson_Object
## PRIVATE
A special type describing how to convert an object into a set of table
@ -15,6 +16,10 @@ type Convertible_To_Columns
Convertible_To_Columns.from (that:JS_Object) =
Convertible_To_Columns.Value that.field_names that.get
## PRIVATE
Convertible_To_Columns.from (that:Jackson_Object) =
Convertible_To_Columns.Value that.field_names that.get
## PRIVATE
Convertible_To_Columns.from (that:Map) =
pairs = that.keys.map k-> [k.to_text, k]
@ -23,6 +28,24 @@ Convertible_To_Columns.from (that:Map) =
Error.throw (Illegal_Argument.Error "Cannot convert "+that.to_display_text+" to a set of columns, because its keys are duplicated when converted to text.")
Convertible_To_Columns.Value field_map.keys (k-> that.get (field_map.get k))
## PRIVATE
Convertible_To_Columns.from (that:Pair) =
Convertible_To_Columns.Value ["0", "1"] (k-> if k == "0" then that.first else that.second)
## PRIVATE
Convertible_To_Columns.from (that:Vector) =
fields = 0.up_to that.length . map _.to_text
Convertible_To_Columns.Value fields (k-> that.at (Integer.parse k))
## PRIVATE
Convertible_To_Columns.from (that:Array) = Convertible_To_Columns.from that.to_vector
## PRIVATE
Convertible_To_Columns.from (that:Range) = Convertible_To_Columns.from that.to_vector
## PRIVATE
Convertible_To_Columns.from (that:Date_Range) = Convertible_To_Columns.from that.to_vector
## PRIVATE
Convertible_To_Columns.from (that:Any) =
name = "Value"

View File

@ -1,4 +1,7 @@
from Standard.Base import all
import Standard.Base.Data.Java_Json.Jackson_Object
import project.Data.Conversions.Convertible_To_Columns.Convertible_To_Columns
## PRIVATE
A special type that is used to define what types can be converted to a table
@ -14,7 +17,9 @@ type Convertible_To_Rows
Arguments:
- length: The number of rows in the table.
- getter: Get the value for a specified row.
Value length:Integer (getter : Integer->Any)
- columns: The names for the columns when the object is expanded.
These will be appended to the name of the input column.
Value length:Integer (getter : Integer->Any) (columns:Vector=["Value"])
## PRIVATE
Return the iterator values as a `Vector`.
@ -39,6 +44,51 @@ Convertible_To_Rows.from that:Pair = Convertible_To_Rows.Value that.length that.
## PRIVATE
Convertible_To_Rows.from that:Date_Range = Convertible_To_Rows.Value that.length that.get
## PRIVATE
Convertible_To_Rows.from that:Map =
vals = that.to_vector.map p-> Key_Value.Pair p.first p.second
Convertible_To_Rows.Value vals.length vals.get ["Key", "Value"]
## PRIVATE
Convertible_To_Rows.from that:JS_Object =
vals = that.map_with_key k->v-> Key_Value.Pair k v
Convertible_To_Rows.Value vals.length vals.get ["Key", "Value"]
## PRIVATE
Convertible_To_Rows.from that:Jackson_Object =
vals = that.map_with_key k->v-> Key_Value.Pair k v
Convertible_To_Rows.Value vals.length vals.get ["Key", "Value"]
## PRIVATE
Convertible_To_Rows.from (that:Any) =
Convertible_To_Rows.Value 1 (n-> if n==0 then that else Nothing)
## PRIVATE
type Key_Value
## PRIVATE
Arguments:
- key: The key of the pair.
- value: The value of the pair.
Pair key:Any value:Any
## PRIVATE
at self idx = self.get idx
## PRIVATE
Return the key or value of the pair, selected by index (0/1) or by name ("Key"/"Value").
get self idx = case idx of
0 -> self.key
1 -> self.value
"Key" -> self.key
"Value" -> self.value
_ -> Nothing
## PRIVATE
is_empty self = False
## PRIVATE
length self = 2
## PRIVATE
Convertible_To_Columns.from (that:Key_Value) =
Convertible_To_Columns.Value ["Key", "Value"] (k-> if k == "Key" then that.key else that.value)

View File

@ -1347,7 +1347,7 @@ type Table
expand_to_rows self column at_least_one_row=False =
Expand_Objects_Helpers.expand_to_rows self column at_least_one_row
## ALIAS filter rows
## ALIAS filter rows, where
GROUP Standard.Base.Selections
ICON preparation
@ -2487,14 +2487,13 @@ type Table
example_to_xlsx = Examples.inventory_table.write (enso_project.data / "example_xlsx_output.xlsx") Excel
@format Widget_Helpers.write_table_selector
write : File|Text -> File_Format -> Existing_File_Behavior -> Match_Columns -> Problem_Behavior -> File ! Column_Count_Mismatch | Illegal_Argument | File_Error
write self path format=Auto_Detect on_existing_file=Existing_File_Behavior.Backup match_columns=Match_Columns.By_Name on_problems=Report_Warning =
file = File.new path
write : File -> File_Format -> Existing_File_Behavior -> Match_Columns -> Problem_Behavior -> File ! Column_Count_Mismatch | Illegal_Argument | File_Error
write self path:File format=Auto_Detect on_existing_file=Existing_File_Behavior.Backup match_columns=Match_Columns.By_Name on_problems=Report_Warning =
case format of
_ : Auto_Detect ->
base_format = format.get_writing_format file
if base_format == Nothing then Error.throw (File_Error.Unsupported_Output_Type file Table) else
self.write file format=base_format on_existing_file match_columns on_problems
base_format = format.get_writing_format path
if base_format == Nothing then Error.throw (File_Error.Unsupported_Output_Type path Table) else
self.write path format=base_format on_existing_file match_columns on_problems
_ ->
handle_no_write_method caught_panic =
is_write = caught_panic.payload.method_name == "write_table"
@ -2502,7 +2501,7 @@ type Table
Error.throw (File_Error.Unsupported_Output_Type format Table)
Panic.catch No_Such_Method handler=handle_no_write_method <|
to_write = if Context.Output.is_enabled then self else self.take 1000
format.write_table file to_write on_existing_file match_columns on_problems
format.write_table path to_write on_existing_file match_columns on_problems
## Creates a text representation of the table using the CSV format.
to_csv : Text

View File

@ -33,22 +33,19 @@ type Excel_Workbook
- file: The file to load.
- xls_format: Whether to use the old XLS format (default is XLSX).
- headers: Whether to use the first row as headers (default is to infer).
new : File | Text | Temporary_File -> Boolean -> Boolean | Infer -> Excel_Workbook
new file xls_format=False headers=Infer =
file_obj = case file of
tmp : Temporary_File -> tmp
other -> File.new other
file_for_errors = if file_obj.is_a Temporary_File then Nothing else file_obj
new : File | Temporary_File -> Boolean -> Boolean | Infer -> Excel_Workbook
new file:(File|Temporary_File) xls_format=False headers=Infer =
file_for_errors = if file.is_a Temporary_File then Nothing else file
continuation raw_file =
format = if xls_format then ExcelFileFormat.XLS else ExcelFileFormat.XLSX
File_Error.handle_java_exceptions raw_file <| Excel_Reader.handle_bad_format file_for_errors <| Illegal_State.handle_java_exception <|
# The `java_file` depends on the liveness of the possible `Temporary_File` but that is ensured by storing the `file_obj` in the resulting workbook instance.
# The `java_file` depends on the liveness of the possible `Temporary_File` but that is ensured by storing the `file` in the resulting workbook instance.
java_file = Java_File.new raw_file.absolute.normalize.path
excel_connection_resource = Managed_Resource.register (ExcelConnectionPool.INSTANCE.openReadOnlyConnection java_file format) close_connection
Excel_Workbook.Value (Ref.new excel_connection_resource) file_obj xls_format headers
Excel_Workbook.Value (Ref.new excel_connection_resource) file xls_format headers
case file_obj of
case file of
tmp : Temporary_File -> tmp.with_file continuation
f : File -> continuation f

View File

@ -28,9 +28,13 @@ expand_column (table : Table) (column : Text | Integer) (fields : (Vector Text)
Prefix_Name.None -> ""
Prefix_Name.Column_Name -> column_object.name+" "
Prefix_Name.Custom value -> value
default_name = case prefix of
Prefix_Name.None -> "Value"
Prefix_Name.Column_Name -> column_object.name
Prefix_Name.Custom value -> value
naming_strategy = table.column_naming_helper.create_unique_name_strategy
naming_strategy.mark_used (table.column_names.filter (c->c!=column_object.name))
new_names = naming_strategy.make_all_unique (expanded.column_names.map n-> resolved_prefix+n)
new_names = naming_strategy.make_all_unique (expanded.column_names.map n-> if n=='Value' then default_name else resolved_prefix+n)
new_columns = new_names.zip expanded.columns (n->c-> c.rename n)
## Create Merged Columns
@ -74,13 +78,16 @@ expand_column (table : Table) (column : Text | Integer) (fields : (Vector Text)
# => Table.new [["aaa", [1, 1, 2, 2]], ["bbb", [30, 31, 40, 41]]]
@column Widget_Helpers.make_column_name_selector
expand_to_rows : Table -> Text | Integer -> Boolean -> Table ! Type_Error | No_Such_Column | Index_Out_Of_Bounds
expand_to_rows table column at_least_one_row=False =
expand_to_rows table column:(Text|Integer) at_least_one_row=False = if column.is_a Integer then expand_to_rows table (table.at column).name at_least_one_row else
row_expander : Any -> Vector
row_expander value:Convertible_To_Rows = value.to_vector
column_names : Any -> Vector
column_names value:Convertible_To_Rows = value.columns.map name-> if name=="Value" then column else column+" "+name
Java_Problems.with_problem_aggregator Problem_Behavior.Report_Warning java_problem_aggregator->
builder size = make_inferred_builder size java_problem_aggregator
Fan_Out.fan_out_to_rows table column row_expander at_least_one_row column_builder=builder
Fan_Out.fan_out_to_rows table column row_expander column_names at_least_one_row column_builder=builder
## PRIVATE
create_table_from_objects : Convertible_To_Rows -> (Vector Text | Nothing) -> Table
@ -130,6 +137,6 @@ create_table_from_objects (value : Convertible_To_Rows) (fields : Vector | Nothi
columns = case preset_fields of
True -> fields.distinct.map column_map.get
False ->
if discovered_field_names.is_empty then Error.throw (Illegal_Argument.Error "Unable to discover expected column names, because all input objects had no fields. Specify fields explicitly if you need a constant set of expected columns.") else
if discovered_field_names.is_empty then Error.throw (Illegal_Argument.Error "Unable to generate column names as all inputs had no fields.") else
discovered_field_names.to_vector.map column_map.get
Table.new columns

View File

@ -1,6 +1,8 @@
from Standard.Base import all
import Standard.Base.Runtime.Ref.Ref
import project.Data.Column.Column
import project.Data.Conversions.Convertible_To_Rows.Key_Value
import project.Data.Table.Table
import project.Data.Type.Value_Type.Value_Type
import project.Internal.Problem_Builder.Problem_Builder
@ -40,16 +42,20 @@ fan_out_to_columns table input_column_id function column_count=Nothing column_bu
- input_column: The column to transform.
- function: A function that transforms a single element of `input_column`
to multiple values.
- column_names: The names for the generated columns, or a callback to create
the names for each row.
- at_least_one_row: When true, if the function returns an empty list, a
single row is output with `Nothing` for the transformed column. If false,
the row is not output at all.
fan_out_to_rows : Table -> Text | Integer -> (Any -> Vector Any) -> Boolean -> (Integer -> Any) -> Problem_Behavior -> Table
fan_out_to_rows table input_column_id function at_least_one_row=False column_builder=make_string_builder on_problems=Report_Error =
fan_out_to_rows : Table -> Text -> (Any -> Vector Any) -> Vector | Function -> Boolean -> (Integer -> Any) -> Problem_Behavior -> Table
fan_out_to_rows table input_column_id:Text function column_names=[input_column_id] at_least_one_row=False column_builder=make_string_builder on_problems=Report_Error =
## Treat this as a special case of fan_out_to_rows_and_columns, with one
column. Wrap the provided function to convert each value to a singleton
`Vector`.
wrapped_function x = function x . map y-> [y]
column_names = [input_column_id]
wrapped_function x = function x . map y-> case y of
_:Vector -> y
_:Key_Value -> y
_ -> [y]
fan_out_to_rows_and_columns table input_column_id wrapped_function column_names at_least_one_row=at_least_one_row column_builder=column_builder on_problems=on_problems
## PRIVATE
@ -99,63 +105,126 @@ fan_out_to_rows_and_columns table input_column_id function column_names at_least
input_column = table.at input_column_id
input_storage = input_column.java_column.getStorage
num_input_rows = input_storage.size
num_output_columns = column_names.length
# Guess that most of the time, we'll get at least one value for each input.
initial_size = input_column.length
# Accumulates the outputs of the function.
output_column_builders = Vector.new num_output_columns _-> column_builder initial_size
# Accumulates repeated position indices for the order mask.
order_mask_positions = Vector.new_builder initial_size
maybe_add_empty_row vecs =
should_add_empty_row = vecs.is_empty && at_least_one_row
if should_add_empty_row.not then vecs else
empty_row = Vector.fill num_output_columns Nothing
[empty_row]
0.up_to num_input_rows . each i->
input_value = input_storage.getItemBoxed i
output_values = function input_value |> maybe_add_empty_row
# Append each group of values to the builder.
output_values.each row_unchecked->
row = uniform_length num_output_columns row_unchecked problem_builder
row.each_with_index i-> v-> output_column_builders.at i . append v
# Append n copies of the input row position, n = # of output values.
repeat_each output_values.length <| order_mask_positions.append i
# Create the columns and a mask.
pair = if column_names.is_a Vector then fan_out_to_rows_and_columns_fixed input_storage function at_least_one_row column_names column_builder problem_builder else
fan_out_to_rows_and_columns_dynamic input_storage function at_least_one_row column_names column_builder problem_builder
raw_output_columns = pair.first
order_mask_positions = pair.second
# Reserve the non-input column names that will not be changing.
non_input_columns = table.columns.filter c-> c.name != input_column.name
unique.mark_used <| non_input_columns.map .name
# Build the output column
output_storages = output_column_builders.map .seal
output_columns = output_storages.map_with_index i-> output_storage->
column_name = unique.make_unique <| column_names.at i
Column.from_storage column_name output_storage
# Make output columns unique.
output_columns = raw_output_columns.map column->
column_name = unique.make_unique column.name
column.rename column_name
# Build the order mask.
order_mask = OrderMask.fromArray (order_mask_positions.to_vector)
## Build the new table, replacing the input column with the new output
columns.
## Build the new table, replacing the input column with the new output columns.
new_columns_unflattened = table.columns.map column->
case column.name == input_column_id of
True ->
# Replace the input column with the output columns.
output_columns
False ->
# Build a new column from the old one with the mask
old_storage = column.java_column.getStorage
new_storage = old_storage.applyMask order_mask
[Column.from_storage column.name new_storage]
new_columns = new_columns_unflattened.flatten
new_table = Table.new new_columns
# Replace the input column with the output columns.
if column.name == input_column_id then output_columns else
# Build a new column from the old one with the mask
old_storage = column.java_column.getStorage
new_storage = old_storage.applyMask order_mask
[Column.from_storage column.name new_storage]
new_table = Table.new new_columns_unflattened.flatten
problem_builder.attach_problems_after on_problems new_table
## PRIVATE
Inner method for fan_out_to_rows_and_columns where the column names are fixed.
fan_out_to_rows_and_columns_fixed : Any -> (Any -> Vector (Vector Any)) -> Boolean -> Vector Text -> (Integer -> Any) -> Problem_Builder -> Vector
fan_out_to_rows_and_columns_fixed input_storage function at_least_one_row:Boolean column_names:Vector column_builder problem_builder =
num_output_columns = column_names.length
num_input_rows = input_storage.size
# Accumulates the outputs of the function.
output_column_builders = Vector.new num_output_columns _-> column_builder num_input_rows
# Accumulates repeated position indices for the order mask.
order_mask_positions = Vector.new_builder num_input_rows
empty_row = [Vector.fill num_output_columns Nothing]
maybe_add_empty_row vecs = if vecs.is_empty && at_least_one_row then empty_row else vecs
0.up_to num_input_rows . each i->
input_value = input_storage.getItemBoxed i
output_values = maybe_add_empty_row (function input_value)
output_values.each row_unchecked->
row = uniform_length num_output_columns row_unchecked problem_builder
row.each_with_index i-> v-> output_column_builders.at i . append v
# Append n copies of the input row position, n = # of output values.
repeat_each output_values.length <| order_mask_positions.append i
output_columns = column_names.map_with_index i->n->
Column.from_storage n (output_column_builders.at i . seal)
[output_columns, order_mask_positions]
## PRIVATE
Inner method for fan_out_to_rows_and_columns where the column names are determined by each row.
fan_out_to_rows_and_columns_dynamic : Any -> (Any -> Vector (Vector Any)) -> Boolean -> (Any -> Text) -> (Integer -> Any) -> Problem_Builder -> Vector
fan_out_to_rows_and_columns_dynamic input_storage function at_least_one_row column_names_for_row column_builder problem_builder =
# Accumulates the outputs of the function.
column_map = Ref.new Map.empty
output_column_builders = Vector.new_builder
# Guess that most of the time, we'll get at least one value for each input.
num_input_rows = input_storage.size
# Column Builder add function
add_column n current_length =
column_map.put (column_map.get.insert n output_column_builders.length)
builder = column_builder num_input_rows
builder.appendNulls current_length
output_column_builders.append builder
# Accumulates repeated position indices for the order mask.
order_mask_positions = Vector.new_builder num_input_rows
maybe_add_empty_row vecs = if (vecs.is_empty && at_least_one_row).not then vecs else
[Vector.fill output_column_builders.length Nothing]
0.up_to num_input_rows . each i->
input_value = input_storage.getItemBoxed i
output_values = maybe_add_empty_row (function input_value)
# get the column names for the row.
row_column_names = column_names_for_row input_value
# Add any missing columns.
row_column_names.each n->
if column_map.get.contains_key n . not then
add_column n order_mask_positions.length
# Append each group of values to the builder.
current_columns = column_map.get
output_values.each row_unchecked->
row = uniform_length row_column_names.length row_unchecked problem_builder
row_column_names.each_with_index i->n->
output_column_builders.at (current_columns.at n) . append (row.at i)
# Fill in values for any column not present
if row_column_names.length != output_column_builders.length then
current_columns.each_with_key k->i->
if row_column_names.contains k . not then
output_column_builders.at i . appendNulls output_values.length
# Append n copies of the input row position, n = # of output values.
repeat_each output_values.length <| order_mask_positions.append i
# Build the output column
output_columns = column_map.get.to_vector.sort on=_.second . map pair->
Column.from_storage pair.first (output_column_builders.at pair.second . seal)
[output_columns, order_mask_positions]
## PRIVATE
Map a multi-valued function over a column and return the results as set of

View File

@ -11,16 +11,16 @@ split_to_columns : Table -> Text | Integer -> Text -> Integer | Nothing -> Probl
split_to_columns table input_column_id delimiter="," column_count=Nothing on_problems=Report_Error =
column = table.at input_column_id
Value_Type.expect_text column <|
fan_out_to_columns table input_column_id (handle_nothing (_.split delimiter)) column_count on_problems=on_problems
fan_out_to_columns table column.name (handle_nothing (_.split delimiter)) column_count on_problems=on_problems
## PRIVATE
Splits a column of text into a set of new rows.
See `Table.split_to_rows`.
split_to_rows : Table -> Text | Integer -> Text -> Table
split_to_rows table input_column_id delimiter="," =
split_to_rows table input_column_id:(Text|Integer) delimiter="," =
column = table.at input_column_id
Value_Type.expect_text column
fan_out_to_rows table input_column_id (handle_nothing (_.split delimiter)) at_least_one_row=True
Value_Type.expect_text column <|
fan_out_to_rows table column.name (handle_nothing (_.split delimiter)) at_least_one_row=True
## PRIVATE
Tokenizes a column of text into a set of new columns using a regular
@ -29,8 +29,8 @@ split_to_rows table input_column_id delimiter="," =
tokenize_to_columns : Table -> Text | Integer -> Text -> Case_Sensitivity -> Integer | Nothing -> Problem_Behavior -> Table
tokenize_to_columns table input_column_id pattern case_sensitivity column_count on_problems =
column = table.at input_column_id
Value_Type.expect_text column
fan_out_to_columns table input_column_id (handle_nothing (_.tokenize pattern case_sensitivity)) column_count on_problems=on_problems
Value_Type.expect_text column <|
fan_out_to_columns table column.name (handle_nothing (_.tokenize pattern case_sensitivity)) column_count on_problems=on_problems
## PRIVATE
Tokenizes a column of text into a set of new rows using a regular
@ -39,8 +39,8 @@ tokenize_to_columns table input_column_id pattern case_sensitivity column_count
tokenize_to_rows : Table -> Text | Integer -> Text -> Case_Sensitivity -> Boolean -> Table
tokenize_to_rows table input_column_id pattern="." case_sensitivity=Case_Sensitivity.Sensitive at_least_one_row=False =
column = table.at input_column_id
Value_Type.expect_text column
fan_out_to_rows table input_column_id (handle_nothing (_.tokenize pattern case_sensitivity)) at_least_one_row=at_least_one_row
Value_Type.expect_text column <|
fan_out_to_rows table column.name (handle_nothing (_.tokenize pattern case_sensitivity)) at_least_one_row=at_least_one_row
## PRIVATE
Converts a Text column into new columns using a regular expression
@ -54,13 +54,14 @@ parse_to_columns table input_column_id (pattern:(Text | Regex)=".") case_sensiti
case_insensitive = case_sensitivity.is_case_insensitive_in_memory
Regex.compile pattern case_insensitive=case_insensitive
fun = handle_nothing (regex_parse_to_vectors regex)
column_names = regex_to_column_names regex input_column_id
column = table.at input_column_id
fun = handle_nothing (regex_parse_to_vectors regex)
column_names = regex_to_column_names regex column.name
new_table = Value_Type.expect_text column <|
fan_out_to_rows_and_columns table input_column_id fun column_names at_least_one_row=True on_problems=on_problems
fan_out_to_rows_and_columns table column.name fun column_names at_least_one_row=True on_problems=on_problems
if parse_values then new_table.parse on_problems=on_problems else new_table
## PRIVATE

View File

@ -118,7 +118,7 @@ make_filter_condition_selector table display=Display.Always =
builder.append (Option "Less Than" fqn+".Less" [["than", col_names]])
builder.append (Option "Less Than Or Equal" fqn+".Equal_Or_Less" [["than", col_names]])
builder.append (Option "Greater Than" fqn+".Greater" [["than", col_names]])
builder.append (Option "Greater Than Or Equal" fqn+".Greater_Or_Less" [["than", col_names]])
builder.append (Option "Greater Than Or Equal" fqn+".Equal_Or_Greater" [["than", col_names]])
builder.append (Option "Between" fqn+".Between" [["lower", col_names], ["upper", col_names]])
builder.append (Option "Equals Ignore Case" fqn+".Equal_Ignore_Case" [["to", col_names]])
builder.append (Option "Starts With" fqn+".Starts_With" [["prefix", col_names]])

View File

@ -20,4 +20,4 @@ file_uploading path =
- file_path: The path at which the file is being uploaded.
type File_Being_Uploaded
## PRIVATE
Value file_path
Value file_path:Text

View File

@ -43,31 +43,46 @@ case class GithubHeuristic(info: DependencyInformation, log: Logger) {
*/
def tryDownloadingAttachments(address: String): Seq[Attachment] =
try {
val homePage = url(address).cat.!!
val fileRegex = """<a .*? href="(.*?)".*?>(.*?)</a>""".r("href", "name")
val matches = fileRegex
.findAllMatchIn(homePage)
.map(m => (m.group("name"), m.group("href")))
.filter(p => mayBeRelevant(p._1))
.toList
matches.flatMap { case (_, href) =>
try {
val content =
url("https://github.com" + href.replace("blob", "raw")).cat.!!
Seq(
AttachedFile(
PortablePath.of(href),
content,
origin = Some("github.com")
)
)
} catch {
case NonFatal(error) =>
log.warn(
s"Found file $href but cannot download it: $error"
)
Seq()
}
val homePage = url(address).cat.!!
val branchRegex = """"defaultBranch":"([^"]*?)"""".r("branch")
val branch = branchRegex.findFirstMatchIn(homePage).map(_.group("branch"))
branch match {
case None =>
log.warn(s"Cannot find default branch for $address")
Seq()
case Some(branch) =>
val fileRegex =
"""\{"name":"([^"]*?)","path":"([^"]*?)","contentType":"file"\}"""
.r("name", "path")
val matches = fileRegex
.findAllMatchIn(homePage)
.map(m => (m.group("name"), m.group("path")))
.filter(p => mayBeRelevant(p._1))
.toList
matches.flatMap { case (_, path) =>
val rawHref = address + "/raw/" + branch + "/" + path
// This path is reconstructed to match the 'legacy' format for compatibility with older versions of the review settings.
// It has the format <org>/<repo>/blob/<branch>/<path>
val internalPath = address
.stripPrefix("https://github.com")
.stripSuffix("/") + "/blob/" + branch + "/" + path
try {
val content = url(rawHref).cat.!!
Seq(
AttachedFile(
PortablePath.of(internalPath),
content,
origin = Some(address)
)
)
} catch {
case NonFatal(error) =>
log.warn(
s"Found file $rawHref but cannot download it: $error"
)
Seq()
}
}
}
} catch {
case NonFatal(error) =>

View File

@ -6,7 +6,11 @@ import java.io.OutputStream;
import java.nio.Buffer;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.*;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.util.Arrays;
import java.util.function.BiConsumer;
import java.util.function.Function;
@ -99,7 +103,7 @@ public class Encoding_Utils {
* @return the resulting string
*/
public static ResultWithWarnings<String> from_bytes(byte[] bytes, Charset charset) {
if (bytes.length == 0) {
if (bytes == null || bytes.length == 0) {
return new ResultWithWarnings<>("");
}

View File

@ -0,0 +1,417 @@
package org.enso.base;
import com.ibm.icu.text.Normalizer2;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;
import java.util.function.Function;
import java.util.logging.Level;
import java.util.logging.Logger;
import org.enso.base.arrays.LongArrayList;
import org.graalvm.polyglot.Context;
/** A reader for reading lines from a file one at a time. */
public class FileLineReader {
public static class ByteArrayOutputStreamWithContains extends ByteArrayOutputStream {
public ByteArrayOutputStreamWithContains(int size) {
super(size);
}
/** Creates a preloaded stream from a byte array. */
public static ByteArrayOutputStreamWithContains fromByteArray(byte[] bytes) {
var stream = new ByteArrayOutputStreamWithContains(0);
stream.buf = bytes;
stream.count = bytes.length;
return stream;
}
/**
* Computes the longest prefix for the given byte array. Based on <a
* href="https://www.geeksforgeeks.org/kmp-algorithm-for-pattern-searching/">Geeks for geeks</a>
*/
public static int[] computeLongestPrefix(byte[] bytes) {
int[] longestPrefix = new int[bytes.length];
int i = 1;
int len = 0;
while (i < bytes.length) {
if (bytes[i] == bytes[len]) {
len++;
longestPrefix[i++] = len;
} else if (len == 0) {
longestPrefix[i++] = 0;
} else {
len = longestPrefix[len - 1];
}
}
return longestPrefix;
}
/** Checks if the stream contains the given byte array. */
public boolean contains(byte[] bytes, int[] longestPrefix) {
// ToDo: Needs to deal with the Unicode scenario where the next character is a combining
// character. #8900
if (bytes.length > count) {
return false;
}
int i = 0;
int j = 0;
while ((count - i) >= (bytes.length - j)) {
if (buf[i] == bytes[j]) {
i++;
j++;
}
if (j == bytes.length) {
return true;
}
if (i < count && buf[i] != bytes[j]) {
if (j != 0) {
j = longestPrefix[j - 1];
} else {
i++;
}
}
}
return false;
}
}
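// Usage sketch (illustrative, not part of the PR): searching a buffered line for a needle
// without decoding it to a String; the needle and line contents below are assumptions.
byte[] needle = "error".getBytes(StandardCharsets.UTF_8);
int[] prefix = ByteArrayOutputStreamWithContains.computeLongestPrefix(needle);
var line = ByteArrayOutputStreamWithContains.fromByteArray(
    "2024-01-04 error: disk full".getBytes(StandardCharsets.UTF_8));
boolean hit = line.contains(needle, prefix); // true, matched on the raw bytes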
private static class CancellationToken {
public boolean isCancelled = false;
public void cancel() {
isCancelled = true;
}
}
private static final Logger LOGGER = Logger.getLogger("enso-file-line-reader");
/** Amount of data to read at a time for a single line (4KB). */
private static final int LINE_BUFFER = 4 * 1024;
/** Amount of data to read at a time (4MB). */
private static final int BUFFER_SIZE = 4 * 1024 * 1024;
private static boolean moreToRead(int c, MappedByteBuffer buffer) {
return switch (c) {
case '\n', -1 -> false;
case '\r' -> {
c = buffer.hasRemaining() ? buffer.get() : '\n';
if (c != '\n') {
buffer.position(buffer.position() - 1);
}
yield false;
}
default -> true;
};
}
private static int readByte(MappedByteBuffer buffer) {
return buffer.hasRemaining() ? buffer.get() : -1;
}
/**
* Reads a line into an OutputStream. Returns true if the end of the line was found, false if the
* buffer finished.
*/
private static boolean readLine(MappedByteBuffer buffer, ByteArrayOutputStream result) {
int c = readByte(buffer);
while (moreToRead(c, buffer)) {
result.write(c);
c = readByte(buffer);
Context.getCurrent().safepoint();
}
return c != -1 && (c != '\r' || buffer.hasRemaining());
}
/**
* Scans forward one line. Returns true if the end of the line was found, false if the buffer
* finished.
*/
private static boolean scanLine(MappedByteBuffer buffer) {
int c = readByte(buffer);
while (moreToRead(c, buffer)) {
c = readByte(buffer);
Context.getCurrent().safepoint();
}
return c != -1 && (c != '\r' || buffer.hasRemaining());
}
/** Reads a line from a file at the given index using the existing rowMap. */
private static String readLineByIndex(
File file, long length, LongArrayList rowMap, int index, Charset charset) throws IOException {
if (index >= rowMap.getSize()) {
throw new IndexOutOfBoundsException(index);
}
long position = rowMap.get(index);
if (position >= length) {
return null;
}
long toRead =
rowMap.getSize() > index + 1 ? rowMap.get(index + 1) - position : length - position;
// Output buffer
var outputStream = new ByteArrayOutputStream(128);
// Only read what we have to.
try (var stream = new FileInputStream(file)) {
var channel = stream.getChannel();
int bufferSize = (int) Math.min(LINE_BUFFER, toRead);
long remaining = toRead - bufferSize;
var buffer = channel.map(FileChannel.MapMode.READ_ONLY, position, bufferSize);
var result = readLine(buffer, outputStream);
while (!result && remaining > 0) {
position += bufferSize;
bufferSize = (int) Math.min(LINE_BUFFER, remaining);
remaining -= bufferSize;
buffer = channel.map(FileChannel.MapMode.READ_ONLY, position, bufferSize);
result = readLine(buffer, outputStream);
}
}
return outputStream.toString(charset);
}
/** Scans forward in a file and returns the line at the given index. */
public static String readSingleLine(
File file,
long length,
LongArrayList rowMap,
int index,
Charset charset,
Function<ByteArrayOutputStreamWithContains, String> filter)
throws IOException {
int size = rowMap.getSize();
if (index != -1 && size > index) {
return readLineByIndex(file, length, rowMap, index, charset);
}
// Start at the last known line and scan forward.
return forEachLine(file, length, rowMap, size - 1, index, charset, filter, null);
}
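// Usage sketch (illustrative, not part of the PR): fetching a single line lazily. The file
// path is an assumption; the row map caches line offsets between calls and is seeded with
// offset 0. IOException handling is elided.
var file = new File("big.log");
var rowMap = new LongArrayList();
rowMap.add(0);
String tenth = FileLineReader.readSingleLine(
    file, file.length(), rowMap, 9, StandardCharsets.UTF_8, null); // 0-based index, so line 10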
/** Scans forward in a file reading line by line, returning all the matching lines. */
public static List<String> readLines(
File file,
long length,
LongArrayList rowMap,
int startAt,
int endAt,
Charset charset,
Function<ByteArrayOutputStreamWithContains, String> filter)
throws IOException {
List<String> result = new ArrayList<>();
forEachLine(
file, length, rowMap, startAt, endAt, charset, filter, (index, line) -> result.add(line));
return result;
}
/**
* Scans forward in a file reading line by line.
*
* @param file The file to read.
* @param length The length of the file in bytes.
* @param rowMap The rowMap to use.
* @param startAt The index to start at.
* @param endAt The index to end at (inclusive).
* @param charset The charset to use.
* @param filter The filter to apply to each line.
* @param action The action to apply to each line (optional).
* @return The last line read or null if end of file is reached.
*/
public static String forEachLine(
File file,
long length,
LongArrayList rowMap,
int startAt,
int endAt,
Charset charset,
Function<ByteArrayOutputStreamWithContains, String> filter,
BiConsumer<Integer, String> action)
throws IOException {
return innerForEachLine(
file, length, rowMap, startAt, endAt, charset, filter, action, new CancellationToken());
}
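// Usage sketch (illustrative, not part of the PR): scanning a whole file while building the
// row map. The path and charset are assumptions; IOException handling is elided.
var file = new File("big.log");
var rowMap = new LongArrayList();
rowMap.add(0); // offset of the first line
FileLineReader.forEachLine(
    file, file.length(), rowMap, 0, -1, // endAt == -1 scans to the end of the file
    StandardCharsets.UTF_8,
    null, // no filter: every line is indexed and passed to the action
    (index, line) -> System.out.println(index + ": " + line));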
private static String innerForEachLine(
File file,
long length,
LongArrayList rowMap,
int startAt,
int endAt,
Charset charset,
Function<ByteArrayOutputStreamWithContains, String> filter,
BiConsumer<Integer, String> action,
CancellationToken cancellationToken)
throws IOException {
if (startAt >= rowMap.getSize()) {
throw new IndexOutOfBoundsException(startAt);
}
int index = action == null ? rowMap.getSize() - 1 : startAt;
long position = rowMap.get(index);
if (position >= length) {
return null;
}
boolean readAll = filter != null || action != null || endAt == -1;
var outputStream = new ByteArrayOutputStreamWithContains(128);
String output = null;
try (var stream = new FileInputStream(file)) {
var channel = stream.getChannel();
var bufferSize = (int) Math.min(BUFFER_SIZE, (length - position));
var truncated = bufferSize != (length - position);
var buffer = channel.map(FileChannel.MapMode.READ_ONLY, position, bufferSize);
// Loop until we either reach the required record or run out of data.
while (!cancellationToken.isCancelled
&& (endAt == -1 || index <= endAt)
&& (truncated || buffer.hasRemaining())) {
var linePosition = buffer.position() + position;
// Read a line.
outputStream.reset();
boolean success =
(readAll || index == endAt) ? readLine(buffer, outputStream) : scanLine(buffer);
if (success || !truncated) {
String line = null;
if (filter == null || (line = filter.apply(outputStream)) != null) {
if (index >= rowMap.getSize()) {
rowMap.add(linePosition);
}
if (action != null) {
line = line == null ? outputStream.toString(charset) : line;
action.accept(index, line);
}
if (index == endAt) {
output = line == null ? outputStream.toString(charset) : line;
}
if (index % 100000 == 0) {
LOGGER.log(Level.INFO, "Scanned Lines: {0}", index);
}
index++;
// If no filter we can record the start of the next line.
if (filter == null && index == rowMap.getSize()) {
rowMap.add(buffer.position() + position);
}
}
// Fast-forward if needed
if (filter != null && index < rowMap.getSize()) {
int newPosition = Math.min(bufferSize, (int) (rowMap.get(index) - position));
buffer.position(newPosition);
}
} else {
// Read more if we need to
if (!buffer.hasRemaining()) {
position = linePosition;
bufferSize = (int) Math.min(BUFFER_SIZE, (length - position));
truncated = bufferSize != (length - position);
buffer = channel.map(FileChannel.MapMode.READ_ONLY, position, bufferSize);
}
}
}
if (!truncated && !buffer.hasRemaining() && rowMap.get(rowMap.getSize() - 1) != length) {
// Add the end-of-file offset to mark that the end was reached.
rowMap.add(length);
}
return output;
}
}
/**
* Scans forward in a file reading line by line until it finds a line that matches the new filter.
*/
public static long findFirstNewFilter(
File file,
long length,
LongArrayList rowMap,
int endAt,
Charset charset,
Function<ByteArrayOutputStreamWithContains, String> filter,
Function<ByteArrayOutputStreamWithContains, String> newFilter)
throws IOException {
final CancellationToken token = new CancellationToken();
final List<Long> result = new ArrayList<>();
BiConsumer<Integer, String> action =
(index, line) -> {
var bytes = line.getBytes(charset);
var outputStream = ByteArrayOutputStreamWithContains.fromByteArray(bytes);
if (newFilter.apply(outputStream) != null) {
result.add(rowMap.get(index));
token.cancel();
}
};
innerForEachLine(file, length, rowMap, 0, endAt, charset, filter, action, token);
return result.isEmpty() ? rowMap.get(rowMap.getSize() - 1) : result.get(0);
}
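// Usage sketch (illustrative, not part of the PR), reusing `file` and `rowMap` from the
// sketches above; the search terms are assumptions. `offset` is the byte position of the
// first "WARN" line that also mentions "ERROR", or the last indexed position if none matches.
long offset = FileLineReader.findFirstNewFilter(
    file, file.length(), rowMap, -1, StandardCharsets.UTF_8,
    FileLineReader.createContainsFilter("WARN", StandardCharsets.UTF_8),
    FileLineReader.createContainsFilter("ERROR", StandardCharsets.UTF_8));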
/** Creates a filter that checks if the line contains the given string. */
public static Function<ByteArrayOutputStreamWithContains, String> createContainsFilter(
String contains, Charset charset) {
if (isUnicodeCharset(charset)) {
var nfcVersion = Normalizer2.getNFCInstance().normalize(contains);
var nfdVersion = Normalizer2.getNFDInstance().normalize(contains);
if (!nfcVersion.equals(nfdVersion)) {
// Need to use Unicode normalization for equality.
return (outputStream) -> {
var line = outputStream.toString(charset);
return Text_Utils.contains(contains, line) ? line : null;
};
}
}
var bytes = contains.getBytes(charset);
var prefixes = ByteArrayOutputStreamWithContains.computeLongestPrefix(bytes);
return (outputStream) ->
outputStream.contains(bytes, prefixes) ? outputStream.toString(charset) : null;
}
/** Wraps an Enso function filter in a FileLineReader filter. */
public static Function<ByteArrayOutputStreamWithContains, String> wrapBooleanFilter(
Function<String, Boolean> filter, Charset charset) {
return (outputStream) -> {
var line = outputStream.toString(charset);
return filter.apply(line) ? line : null;
};
}
/** Joins two filters together. */
public static Function<ByteArrayOutputStreamWithContains, String> mergeTwoFilters(
Function<ByteArrayOutputStreamWithContains, String> first,
Function<ByteArrayOutputStreamWithContains, String> second) {
return (outputStream) -> {
var first_result = first.apply(outputStream);
return first_result != null ? second.apply(outputStream) : null;
};
}
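// Usage sketch (illustrative, not part of the PR) composing the filter helpers above;
// the search term and length threshold are assumptions.
var containsWarn = FileLineReader.createContainsFilter("WARN", StandardCharsets.UTF_8);
var longLines = FileLineReader.wrapBooleanFilter(line -> line.length() > 80, StandardCharsets.UTF_8);
var warnAndLong = FileLineReader.mergeTwoFilters(containsWarn, longLines);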
private static boolean isUnicodeCharset(Charset charset) {
return charset == StandardCharsets.UTF_8
|| charset == StandardCharsets.UTF_16
|| charset == StandardCharsets.UTF_16BE
|| charset == StandardCharsets.UTF_16LE;
}
}

View File

@ -0,0 +1,44 @@
package org.enso.base.arrays;
import java.util.Arrays;
/** A helper to efficiently build an array of unboxed longs of arbitrary length. */
public class LongArrayList {
private long[] backingStorage;
private int lastIndex = -1;
public LongArrayList() {
backingStorage = new long[32];
}
/** Gets the number of elements in the list. */
public int getSize() {
return lastIndex + 1;
}
/** Gets an element from the list. */
public long get(int index) {
if (index > lastIndex) {
throw new IndexOutOfBoundsException(index);
}
return backingStorage[index];
}
/** Gets an element from the list, or the last element if the index is out of range. */
public long getOrLast(int index) {
return backingStorage[Math.min(index, lastIndex)];
}
/** Adds an element to the list. */
public void add(long x) {
int index;
index = lastIndex + 1;
if (index >= backingStorage.length) {
backingStorage = Arrays.copyOf(backingStorage, backingStorage.length * 2);
}
backingStorage[index] = x;
lastIndex = index;
}
}
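// Usage sketch (illustrative, not part of the PR); the offsets below are assumptions.
var offsets = new LongArrayList();
offsets.add(0L);
offsets.add(4096L);
long first = offsets.get(0);          // 0
long clamped = offsets.getOrLast(10); // 4096 (index clamped to the last element)
int size = offsets.getSize();         // 2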

View File

@ -44,6 +44,32 @@ add_specs suite_builder =
(Date_Time.new 2022 12 12).should_equal (Date_Time.new 2022 12 12)
(Date_Time.new 2022 12 12).should_not_equal (Date_Time.new 1996)
suite_builder.group "Unix epoch conversion" group_builder->
group_builder.specify "should allow creating from second and nanosecond" <|
Date_Time.from_unix_epoch_seconds 0 . should_equal (Date_Time.new 1970 1 1 zone=Time_Zone.utc)
Date_Time.from_unix_epoch_seconds 1 . should_equal (Date_Time.new 1970 1 1 0 0 1 zone=Time_Zone.utc)
Date_Time.from_unix_epoch_seconds 1 123456789 . should_equal (Date_Time.new 1970 1 1 0 0 1 nanosecond=123456789 zone=Time_Zone.utc)
Date_Time.from_unix_epoch_seconds 1704371744 . should_equal (Date_Time.new 2024 1 4 12 35 44 zone=Time_Zone.utc)
Date_Time.from_unix_epoch_seconds 1704371744 123456789 . should_equal (Date_Time.new 2024 1 4 12 35 44 nanosecond=123456789 zone=Time_Zone.utc)
group_builder.specify "should allow creating from milliseconds" <|
Date_Time.from_unix_epoch_milliseconds 0 . should_equal (Date_Time.new 1970 1 1 zone=Time_Zone.utc)
Date_Time.from_unix_epoch_milliseconds 1 . should_equal (Date_Time.new 1970 1 1 0 0 0 millisecond=1 zone=Time_Zone.utc)
Date_Time.from_unix_epoch_milliseconds 123 . should_equal (Date_Time.new 1970 1 1 0 0 0 millisecond=123 zone=Time_Zone.utc)
Date_Time.from_unix_epoch_milliseconds 1704371744123 . should_equal (Date_Time.new 2024 1 4 12 35 44 millisecond=123 zone=Time_Zone.utc)
group_builder.specify "should allow convert to epoch seconds" <|
Date_Time.new 1970 zone=Time_Zone.utc . to_unix_epoch_seconds . should_equal 0
Date_Time.new 1970 1 1 0 0 1 zone=Time_Zone.utc . to_unix_epoch_seconds . should_equal 1
Date_Time.new 1970 1 1 0 0 1 nanosecond=123456789 zone=Time_Zone.utc . to_unix_epoch_seconds . should_equal 1
Date_Time.new 2024 1 4 12 35 44 zone=Time_Zone.utc . to_unix_epoch_seconds . should_equal 1704371744
group_builder.specify "should allow convert to epoch milliseconds" <|
Date_Time.new 1970 zone=Time_Zone.utc . to_unix_epoch_milliseconds . should_equal 0
Date_Time.new 1970 1 1 0 0 0 millisecond=1 zone=Time_Zone.utc . to_unix_epoch_milliseconds . should_equal 1
Date_Time.new 1970 1 1 0 0 0 millisecond=1 microsecond=123 zone=Time_Zone.utc . to_unix_epoch_milliseconds . should_equal 1
Date_Time.new 2024 1 4 12 35 44 millisecond=123 zone=Time_Zone.utc . to_unix_epoch_milliseconds . should_equal 1704371744123
spec_with suite_builder name create_new_datetime parse_datetime nanoseconds_loss_in_precision=False =
suite_builder.group name group_builder->

View File

@ -200,7 +200,7 @@ add_specs suite_builder setup =
case setup.test_selection.supports_mixed_columns of
False -> callback_with_clue (setup.table_builder table_structure)
True ->
all_combinations (Vector.fill table_structure.length [Nothing, Mixed_Type_Object.Value]) . each combination->
all_combinations (Vector.fill table_structure.length [Nothing, Mixed_Type_Object]) . each combination->
amended_table_structure = table_structure.zip combination column_definition-> prefix->
name = column_definition.first
values = column_definition.second
@ -1753,7 +1753,6 @@ add_specs suite_builder setup =
# A dummy value used to force the in-memory backend to infer a mixed type for the given column.
type Mixed_Type_Object
Value
all_combinations variables =
result = Vector.new_builder

View File

@ -1,4 +1,5 @@
from Standard.Base import all
import Standard.Base.Errors.Common.Index_Out_Of_Bounds
import Standard.Base.Errors.Illegal_Argument.Illegal_Argument
from Standard.Table import Table, Sort_Column
@ -170,7 +171,19 @@ add_specs (suite_builder : Suite_Builder) (prefix : Text) (create_connection_fn
group_builder.specify "should allow to materialize columns directly into a Vector" <|
v = data.t1.at 'a' . to_vector
v . should_equal [1, 4]
group_builder.specify "should allow getting specific elements" <|
test_column = data.t1.at 'a'
test_column.get 0 . should_equal 1
test_column.get 3 . should_equal Nothing
test_column.get 4 -1 . should_equal -1
group_builder.specify "should allow getting specific elements (with at)" <|
test_column = data.t1.at 'a'
test_column.at 0 . should_equal 1
test_column.at 1 . should_equal 4
test_column.at 3 . should_fail_with Index_Out_Of_Bounds
group_builder.specify "should handle bigger result sets" <|
data.big_table.read.row_count . should_equal data.big_size

View File

@ -26,6 +26,13 @@ add_specs suite_builder =
empty_column = Column.from_vector "Test" []
group_builder.specify "should allow getting specific elements" <|
test_column.get 0 . should_equal 1
test_column.get 2 . should_equal 5
test_column.get 5 . should_equal 6
test_column.get 6 . should_equal Nothing
empty_column.get 0 -1 . should_equal -1
group_builder.specify "should allow getting specific elements (with at)" <|
test_column.at 0 . should_equal 1
test_column.at 2 . should_equal 5
test_column.at 5 . should_equal 6

View File

@ -12,6 +12,7 @@ import project.In_Memory.Lossy_Conversions_Spec
import project.In_Memory.Parse_To_Table_Spec
import project.In_Memory.Split_Tokenize_Spec
import project.In_Memory.Table_Spec
import project.In_Memory.Table_Conversion_Spec
import project.In_Memory.Table_Date_Spec
import project.In_Memory.Table_Date_Time_Spec
import project.In_Memory.Table_Time_Of_Day_Spec
@ -23,6 +24,7 @@ add_specs suite_builder =
Common_Spec.add_specs suite_builder
Integer_Overflow_Spec.add_specs suite_builder
Lossy_Conversions_Spec.add_specs suite_builder
Table_Conversion_Spec.add_specs suite_builder
Table_Date_Spec.add_specs suite_builder
Table_Date_Time_Spec.add_specs suite_builder
Table_Time_Of_Day_Spec.add_specs suite_builder

View File

@ -16,6 +16,14 @@ add_specs suite_builder =
expected = Table.from_rows ["foo", "bar 1", "bar 2", "bar 3"] expected_rows
t2 = t.split_to_columns "bar" "|"
t2.should_equal expected
group_builder.specify "can do split_to_columns by index" <|
cols = [["foo", [0, 1, 2]], ["bar", ["a|c", "c|d|ef", "gh|ij|u"]]]
t = Table.new cols
expected_rows = [[0, "a", "c", Nothing], [1, "c", "d", "ef"], [2, "gh", "ij", "u"]]
expected = Table.from_rows ["foo", "bar 1", "bar 2", "bar 3"] expected_rows
t2 = t.split_to_columns 1 "|"
t2.should_equal expected
group_builder.specify "can do split_to_columns where split character, first, last and only character" <|
cols = [["foo", [0, 1, 2]], ["bar", ["|cb", "ab|", "|"]]]
@ -41,6 +49,14 @@ add_specs suite_builder =
t2 = t.split_to_rows "bar" "|"
t2.should_equal expected
group_builder.specify "can do split_to_rows by index" <|
cols = [["foo", [0, 1, 2]], ["bar", ["a|c", "c|d|ef", "gh|ij|u"]]]
t = Table.new cols
expected_rows = [[0, "a"], [0, "c"], [1, "c"], [1, "d"], [1, "ef"], [2, "gh"], [2, "ij"], [2, "u"]]
expected = Table.from_rows ["foo", "bar"] expected_rows
t2 = t.split_to_rows 1 "|"
t2.should_equal expected
group_builder.specify "can do split_to_rows where split character, first, last and only character" <|
cols = [["foo", [0, 1, 2]], ["bar", ["|cb", "ab|", "|"]]]
t = Table.new cols
@ -82,6 +98,14 @@ add_specs suite_builder =
t2 = t.tokenize_to_columns "bar" "\d+"
t2.should_equal expected
group_builder.specify "can do tokenize_to_columns by index" <|
cols = [["foo", [0, 1, 2]], ["bar", ["a12b34r5", "23", "2r4r55"]]]
t = Table.new cols
expected_rows = [[0, "12", "34", "5"], [1, "23", Nothing, Nothing], [2, "2", "4", "55"]]
expected = Table.from_rows ["foo", "bar 1", "bar 2", "bar 3"] expected_rows
t2 = t.tokenize_to_columns 1 "\d+"
t2.should_equal expected
group_builder.specify "can do tokenize_to_rows" <|
cols = [["foo", [0, 1, 2]], ["bar", ["a12b34r5", "23", "2r4r55"]]]
t = Table.new cols
@ -90,6 +114,14 @@ add_specs suite_builder =
t2 = t.tokenize_to_rows "bar" "\d+"
t2.should_equal expected
group_builder.specify "can do tokenize_to_rows by index" <|
cols = [["foo", [0, 1, 2]], ["bar", ["a12b34r5", "23", "2r4r55"]]]
t = Table.new cols
expected_rows = [[0, "12"], [0, "34"], [0, "5"], [1, "23"], [2, "2"], [2, "4"], [2, "55"]]
expected = Table.from_rows ["foo", "bar"] expected_rows
t2 = t.tokenize_to_rows 1 "\d+"
t2.should_equal expected
group_builder.specify "can do tokenize_to_columns with some nothings" <|
cols = [["foo", [0, 1, 2, 3]], ["bar", ["a12b34r5", Nothing, "23", "2r4r55"]]]
t = Table.new cols
@ -269,6 +301,12 @@ add_specs suite_builder =
actual = t.parse_to_columns "bar" "(\d)(\d)"
actual.should_equal expected
group_builder.specify "can parse to columns by index" <|
t = Table.from_rows ["foo", "bar", "baz"] [["x", "12 34p q56", "y"], ["xx", "a48 59b", "yy"]]
expected = Table.from_rows ["foo", "bar 1", "bar 2", "baz"] [["x", 1, 2, "y"], ["x", 3, 4, "y"], ["x", 5, 6, "y"], ["xx", 4, 8, "yy"], ["xx", 5, 9, "yy"]]
actual = t.parse_to_columns 1 "(\d)(\d)"
actual.should_equal expected
group_builder.specify "no regex groups" <|
t = Table.from_rows ["foo", "bar", "baz"] [["x", "12 34p q56", "y"], ["xx", "a48 59b", "yy"]]
expected = Table.from_rows ["foo", "bar", "baz"] [["x", 12, "y"], ["x", 34, "y"], ["x", 56, "y"], ["xx", 48, "yy"], ["xx", 59, "yy"]]

View File

@ -77,12 +77,12 @@ add_specs suite_builder =
suite_builder.group "from_objects with JSON (single values)" group_builder->
group_builder.specify "Generates a single-row table from a JSON object" <|
expected = Table.from_rows ["first", "last", "age"] [["Mary", "Smith", 23]]
expected = Table.new [["Key", ["first", "last", "age"]], ["Value", ["Mary", "Smith", 23]]]
Table.from_objects (data.uniform_json.at 0) . should_equal expected
group_builder.specify "works fine even if requested fields are duplicated" <|
expected = Table.from_rows ["first", "last"] [["Mary", "Smith"]]
Table.from_objects (data.uniform_json.at 0) ["first", "last", "first", "first"] . should_equal expected
expected = Table.new [["Key", ["first", "last", "age"]], ["Value", ["Mary", "Smith", 23]]]
Table.from_objects (data.uniform_json.at 0) ["Key", "Value", "Key", "Key"] . should_equal expected
suite_builder.group "from_objects with uniform JSON vector" group_builder->
group_builder.specify "Generates a table from a vector of JSON objects" <|
@ -159,9 +159,19 @@ add_specs suite_builder =
suite_builder.group "expand_column" group_builder->
group_builder.specify "Expands a column of single values" <|
table = Table.new [["aaa", [1, 2]], ["bbb", [3, 4]], ["ccc", [5, 6]]]
expected = Table.new [["aaa", [1, 2]], ["bbb Value", [3, 4]], ["ccc", [5, 6]]]
expected = Table.new [["aaa", [1, 2]], ["bbb", [3, 4]], ["ccc", [5, 6]]]
table.expand_column "bbb" . should_equal expected
group_builder.specify "Expands a column of single values by index" <|
table = Table.new [["aaa", [1, 2]], ["bbb", [3, 4]], ["ccc", [5, 6]]]
expected = Table.new [["aaa", [1, 2]], ["bbb", [3, 4]], ["ccc", [5, 6]]]
table.expand_column 1 . should_equal expected
group_builder.specify "Expands a column of single values by index" <|
table = Table.new [["aaa", [1, 2]], ["bbb", [3, 4]], ["ccc", [5, 6]]]
expected = Table.new [["aaa", [1, 2]], ["bbb", [3, 4]], ["ccc", [5, 6]]]
table.expand_column 1 . should_equal expected
group_builder.specify "Expands a uniform column of JSON objects" <|
table = Table.new [["aaa", [1, 2]], ["bbb", data.uniform_json], ["ccc", [5, 6]]]
expected = Table.new [["aaa", [1, 2]], ["bbb first", ["Mary", "Joe"]], ["bbb last", ["Smith", "Burton"]], ["bbb age", [23, 34]], ["ccc", [5, 6]]]
@ -182,9 +192,19 @@ add_specs suite_builder =
expected = Table.new [["aaa", [1, 2]], ["bbb last", ["Smith", Nothing]], ["bbb height", [Nothing, 1.9]], ["bbb foo", [Nothing, Nothing]], ["ccc", [5, 6]]]
table.expand_column "bbb" ["last", "height", "foo"] . should_equal expected
group_builder.specify "accept vectors/arrays within a column" <|
group_builder.specify "Expands vectors/arrays within a column" <|
table = Table.new [["aaa", [1, 2]], ["bbb", [[1, 2, 3], [4, 5, 6].to_array]]]
expected = Table.new [["aaa", [1, 2]], ["bbb Value", [[1, 2, 3], [4, 5, 6].to_array]]]
expected = Table.new [["aaa", [1, 2]], ["bbb 0", [1, 4]], ["bbb 1", [2, 5]], ["bbb 2", [3, 6]]]
table.expand_column "bbb" . should_equal expected
group_builder.specify "Expands ranges within a column" <|
table = Table.new [["aaa", [1, 2]], ["bbb", [0.up_to 2, 3.up_to 5]]]
expected = Table.new [["aaa", [1, 2]], ["bbb 0", [0, 3]], ["bbb 1", [1, 4]]]
table.expand_column "bbb" . should_equal expected
group_builder.specify "Expands date ranges within a column" <|
table = Table.new [["aaa", [1, 2]], ["bbb", [Date.new 2020 12 1 . up_to (Date.new 2020 12 3), Date.new 2022 12 1 . up_to (Date.new 2022 12 2)]]]
expected = Table.new [["aaa", [1, 2]], ["bbb 0", [Date.new 2020 12 1, Date.new 2022 12 1]], ["bbb 1", [Date.new 2020 12 2, Nothing]]]
table.expand_column "bbb" . should_equal expected
group_builder.specify "will work even if keys are not Text" <|
@ -214,7 +234,7 @@ add_specs suite_builder =
table = Table.new [["aaa", [1, 2]], ["bbb", [Map.from_vector [], Map.from_vector []]], ["ccc", [5, 6]]]
r = table.expand_column "bbb"
r.should_fail_with Illegal_Argument
r.catch.message.should_contain "all input objects had no fields"
r.catch.message.should_contain "as all inputs had no fields"
group_builder.specify "will error when fields=[]" <|
table = Table.new [["aaa", [1, 2]], ["bbb", data.uniform_json], ["ccc", [5, 6]]]
@ -249,6 +269,12 @@ add_specs suite_builder =
expected = Table.new [["aaa", [1, 2]], ["bbb", [3, 4]], ["ccc", [5, 6]]]
table.expand_to_rows "bbb" . should_equal expected
group_builder.specify "Can expand single values by index" <|
values_to_expand = [3, 4]
table = Table.new [["aaa", [1, 2]], ["bbb", values_to_expand], ["ccc", [5, 6]]]
expected = Table.new [["aaa", [1, 2]], ["bbb", [3, 4]], ["ccc", [5, 6]]]
table.expand_to_rows 1 . should_equal expected
group_builder.specify "Can expand Vectors" <|
values_to_expand = [[10, 11], [20, 21, 22], [30]]
table = Table.new [["aaa", [1, 2, 3]], ["bbb", values_to_expand], ["ccc", [5, 6, 7]]]
@ -270,9 +296,9 @@ add_specs suite_builder =
table.expand_to_rows "bbb" . should_equal expected
group_builder.specify "Can expand Pairs" <|
values_to_expand = [Pair.new 10 20, Pair.new "a" [30], Pair.new 40 50]
values_to_expand = [Pair.new 10 20, Pair.new "a" 30, Pair.new 40 50]
table = Table.new [["aaa", [1, 2, 3]], ["bbb", values_to_expand], ["ccc", [5, 6, 7]]]
expected = Table.new [["aaa", [1, 1, 2, 2, 3, 3]], ["bbb", [10, 20, "a", [30], 40, 50]], ["ccc", [5, 5, 6, 6, 7, 7]]]
expected = Table.new [["aaa", [1, 1, 2, 2, 3, 3]], ["bbb", [10, 20, "a", 30, 40, 50]], ["ccc", [5, 5, 6, 6, 7, 7]]]
table.expand_to_rows "bbb" . should_equal expected
group_builder.specify "Can expand Ranges" <|
@ -291,6 +317,18 @@ add_specs suite_builder =
expected = Table.new [["aaa", [1, 1, 2, 2, 2, 3, 3, 3]], ["bbb", values_expanded], ["ccc", [5, 5, 6, 6, 6, 7, 7, 7]]]
table.expand_to_rows "bbb" . should_equal expected
group_builder.specify "Can expand Map" <|
values_to_expand = [Map.empty.insert "a" 10, Map.empty.insert "d" 40 . insert "b" 20, Map.empty.insert "c" 30]
table = Table.new [["aaa", [1, 2, 3]], ["bbb", values_to_expand], ["ccc", [5, 6, 7]]]
expected = Table.new [["aaa", [1, 2, 2, 3]], ["bbb Key", ["a", "d", "b", "c"]], ["bbb", [10, 40, 20, 30]], ["ccc", [5, 6, 6, 7]]]
table.expand_to_rows "bbb" . should_equal expected
group_builder.specify "Can expand JS_Object" <|
values_to_expand = ['{"a": 10}'.parse_json, '{"b": 20, "d": 40}'.parse_json, '{"c": 30}'.parse_json]
table = Table.new [["aaa", [1, 2, 3]], ["bbb", values_to_expand], ["ccc", [5, 6, 7]]]
expected = Table.new [["aaa", [1, 2, 2, 3]], ["bbb Key", ["a", "b", "d", "c"]], ["bbb", [10, 20, 40, 30]], ["ccc", [5, 6, 6, 7]]]
table.expand_to_rows "bbb" . should_equal expected
group_builder.specify "Can expand mixed columns" <|
values_to_expand = [[10, 11], 22.up_to 26, (Date.new 2020 02 28).up_to (Date.new 2020 03 01)]
values_expanded = [10, 11, 22, 23, 24, 25, Date.new 2020 02 28, Date.new 2020 02 29]
@ -345,7 +383,7 @@ add_specs suite_builder =
t.at "Name" . to_vector . should_equal (Vector.fill 5 "Library")
t.at "@catalog" . to_vector . should_equal (Vector.fill 5 "Fiction")
t.at "@letter" . to_vector . should_equal (Vector.fill 5 "A")
t.at "Children Value" . to_vector . map trim_if_text . should_equal ["Hello", "My Book", "World", "Your Book", "Cheap Cars For You"]
t.at "Children" . to_vector . map trim_if_text . should_equal ["Hello", "My Book", "World", "Your Book", "Cheap Cars For You"]
t.at "Children Name" . to_vector . map trim_if_text . should_equal [Nothing, "Book", Nothing, "Book", "Magazine"]
t.at "Children @author" . to_vector . map trim_if_text . should_equal [Nothing, "An Author", Nothing, "Another Author", Nothing]
t.at "Children @month" . to_vector . map trim_if_text . should_equal [Nothing, Nothing, Nothing, Nothing, 'August-2023']

View File

@ -1,2 +1 @@
#license
/FasterXML/jackson-dataformats-binary/blob/2.17/LICENSE

View File

@ -0,0 +1,2 @@
Copyright (c) 2007- Tatu Saloranta, tatu.saloranta@iki.fi
Copyright 2018-2020 Raffaello Giulietti

View File

@ -0,0 +1,2 @@
META-INF/LICENSE
META-INF/jackson-core-LICENSE

View File

@ -0,0 +1 @@
META-INF/jackson-core-NOTICE

View File

@ -0,0 +1,2 @@
Copyright 2010 Google Inc. All Rights Reserved.
Copyright 2011 Google Inc. All Rights Reserved.

View File

@ -1,3 +1,3 @@
58F42EA238F4F16E775412B67F584C74188267FB305705B57A50E10124FE56BC
DC3F2E51015236DC72560E5DD29B13156A2244C6753828B6D0848683018D5ABA
44A2EB4467C91025C305D370F3E8C9430A69FCD957630A539385AB785B0A1C6D
5F5974B8673A2E82B0148235CCE1FC0DD9FB8D3ED9C9552A4D86A4EE14723DE5
0

View File

@ -1 +1 @@
/com-lihaoyi/fansi/blob/master/LICENSE
/com-lihaoyi/Fansi/blob/master/LICENSE

View File

@ -1,2 +1 @@
/circe/circe/blob/series/0.14.x/LICENSE
#license

View File

@ -1,2 +1 @@
/circe/circe-generic-extras/blob/main/LICENSE
#license

View File

@ -1,2 +1 @@
/circe/circe/blob/series/0.14.x/LICENSE
#license

View File

@ -1,2 +1 @@
/circe/circe/blob/series/0.14.x/LICENSE
#license

View File

@ -1,2 +1 @@
/circe/circe/blob/series/0.14.x/LICENSE
#license

View File

@ -1,2 +1 @@
/circe/circe/blob/series/0.14.x/LICENSE
#license

View File

@ -1,2 +1 @@
/circe/circe/blob/series/0.14.x/LICENSE
#license

View File

@ -1 +1 @@
/Philippus/bump/blob/main/LICENSE.md
/philippus/bump/blob/main/LICENSE.md

View File

@ -1,2 +1 @@
/circe/circe/blob/series/0.14.x/LICENSE
#license

View File

@ -1,2 +1 @@
/circe/circe/blob/series/0.14.x/LICENSE
#license

View File

@ -1,2 +1 @@
/circe/circe/blob/series/0.14.x/LICENSE
#license

View File

@ -1,2 +1 @@
/circe/circe/blob/series/0.14.x/LICENSE
#license

View File

@ -1,2 +1 @@
/circe/circe/blob/series/0.14.x/LICENSE
#license

View File

@ -1 +1 @@
/Philippus/bump/blob/main/LICENSE.md
/philippus/bump/blob/main/LICENSE.md

View File

@ -1,3 +1,2 @@
/pureconfig/pureconfig/blob/master/AUTHORS
/pureconfig/pureconfig/blob/master/LICENSE
#license

View File

@ -1,3 +1,2 @@
/pureconfig/pureconfig/blob/master/AUTHORS
/pureconfig/pureconfig/blob/master/LICENSE
#license

View File

@ -1,3 +1,2 @@
/pureconfig/pureconfig/blob/master/AUTHORS
/pureconfig/pureconfig/blob/master/LICENSE
#license

View File

@ -1,2 +1 @@
#license
/pureconfig/pureconfig/blob/master/LICENSE

View File

@ -1,2 +1 @@
/circe/circe/blob/series/0.14.x/LICENSE
#license

View File

@ -1,2 +1 @@
/circe/circe/blob/series/0.14.x/LICENSE
#license

View File

@ -1,2 +1 @@
/circe/circe/blob/series/0.14.x/LICENSE
#license

View File

@ -1,2 +1 @@
/circe/circe/blob/series/0.14.x/LICENSE
#license

View File

@ -1,2 +1 @@
/circe/circe/blob/series/0.14.x/LICENSE
#license

View File

@ -1,2 +1 @@
/circe/circe/blob/series/0.14.x/LICENSE
#license

View File

@ -1 +1 @@
/Philippus/bump/blob/main/LICENSE.md
/philippus/bump/blob/main/LICENSE.md