Add parser for line by line processing (#8719)
- ✅ Linting fixes and groups.
- ✅ Add `File.from that:Text` and use `File` conversions instead of taking both `File` and `Text` and calling `File.new`.
- ✅ Align the Unix Epoch with the UTC timezone and add conversion from a long value to `Date_Time` using it.
- ❌ Add a simple first logging API allowing writing log messages from Enso.
- ✅ Fix a minor style issue where a test type had an empty constructor.
- ❌ Added a `long`-based array builder.
- Added `File_By_Line` to read a file line by line.
- Added a "fast" JSON parser based off Jackson.
- ✅ Altered range `to_vector` to be a proxy Vector.
- ✅ Added `at` and `get` to `Database.Column`.
- ✅ Added `get` to `Table.Column`.
- ✅ Added the ability to expand `Vector`, `Array`, `Range` and `Date_Range` to columns.
- ✅ Altered `expand_to_column` so the default column name is the same as the input column (i.e. no `Value` suffix).
- ✅ Added the ability to expand `Map`, `JS_Object` and `Jackson_Object` to rows, with two columns coming out (an extra key column alongside the value).
- ✅ Fixed a bug where an integer index couldn't be used to expand to rows.
Parent: bb8ff8f89e
Commit: eeaddbc434
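
The two headline additions combine naturally: `File_By_Line` streams a file without loading it whole, and `parse_fast_json` exposes the Jackson-based parser. A minimal sketch of intended use, based on the APIs added below (the sample path and field name are invented, and the unqualified availability of `File_By_Line` is assumed):

    from Standard.Base import all

    example_stream_parse =
        # Lazily index a log file; only line offsets are kept in memory.
        lines = File_By_Line.new (File.new "data/events.jsonl")
        # Parse each line with the Jackson-based parser and pull one field.
        lines.map line-> line.parse_fast_json.get "event_type"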
@@ -609,6 +609,8 @@
   join operations.][8849]
 - [Attach a warning when Nothing is used as a value in a comparison or `is_in`
   `Filter_Condition`.][8865]
+- [Added `File_By_Line` type allowing processing a file line by line. New faster
+  JSON parser based off Jackson.][8719]

 [debug-shortcuts]:
   https://github.com/enso-org/enso/blob/develop/app/gui/docs/product/shortcuts.md#debug
@@ -872,6 +874,7 @@
 [8606]: https://github.com/enso-org/enso/pull/8606
 [8627]: https://github.com/enso-org/enso/pull/8627
 [8691]: https://github.com/enso-org/enso/pull/8691
+[8719]: https://github.com/enso-org/enso/pull/8719
 [8816]: https://github.com/enso-org/enso/pull/8816
 [8849]: https://github.com/enso-org/enso/pull/8849
 [8865]: https://github.com/enso-org/enso/pull/8865
@@ -2598,8 +2598,9 @@ lazy val `std-base` = project
     Compile / packageBin / artifactPath :=
       `base-polyglot-root` / "std-base.jar",
     libraryDependencies ++= Seq(
-      "org.graalvm.polyglot" % "polyglot" % graalMavenPackagesVersion,
-      "org.netbeans.api" % "org-openide-util-lookup" % netbeansApiVersion % "provided"
+      "org.graalvm.polyglot" % "polyglot" % graalMavenPackagesVersion,
+      "org.netbeans.api" % "org-openide-util-lookup" % netbeansApiVersion % "provided",
+      "com.fasterxml.jackson.core" % "jackson-databind" % jacksonVersion
     ),
     Compile / packageBin := Def.task {
       val result = (Compile / packageBin).value
@@ -1,6 +1,21 @@
 Enso
 Copyright 2020 - 2024 New Byte Order sp. z o. o.

+'jackson-annotations', licensed under the The Apache Software License, Version 2.0, is distributed with the Base.
+The license file can be found at `licenses/APACHE2.0`.
+Copyright notices related to this dependency can be found in the directory `com.fasterxml.jackson.core.jackson-annotations-2.15.2`.
+
+'jackson-core', licensed under the The Apache Software License, Version 2.0, is distributed with the Base.
+The license file can be found at `licenses/APACHE2.0`.
+Copyright notices related to this dependency can be found in the directory `com.fasterxml.jackson.core.jackson-core-2.15.2`.
+
+'jackson-databind', licensed under the The Apache Software License, Version 2.0, is distributed with the Base.
+The license file can be found at `licenses/APACHE2.0`.
+Copyright notices related to this dependency can be found in the directory `com.fasterxml.jackson.core.jackson-databind-2.15.2`.
+
 'icu4j', licensed under the Unicode/ICU License, is distributed with the Base.
 The license information can be found along with the copyright notices.
 Copyright notices related to this dependency can be found in the directory `com.ibm.icu.icu4j-73.1`.
@@ -0,0 +1,21 @@
# Jackson JSON processor

Jackson is a high-performance, Free/Open Source JSON processing library.
It was originally written by Tatu Saloranta (tatu.saloranta@iki.fi), and has
been in development since 2007.
It is currently developed by a community of developers.

## Copyright

Copyright 2007-, Tatu Saloranta (tatu.saloranta@iki.fi)

## Licensing

Jackson 2.x core and extension components are licensed under Apache License 2.0
To find the details that apply to this artifact see the accompanying LICENSE file.

## Credits

A list of contributors may be found from CREDITS(-2.x) file, which is included
in some artifacts (usually source distributions); but is always available
from the source code management (SCM) system project uses.
@@ -0,0 +1,26 @@
/*
 * Copyright 2018-2020 Raffaello Giulietti
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 * THE SOFTWARE.
 */

/* Jackson JSON-processor.
 *
 * Copyright (c) 2007- Tatu Saloranta, tatu.saloranta@iki.fi
 */
@@ -0,0 +1,32 @@
# Jackson JSON processor

Jackson is a high-performance, Free/Open Source JSON processing library.
It was originally written by Tatu Saloranta (tatu.saloranta@iki.fi), and has
been in development since 2007.
It is currently developed by a community of developers.

## Copyright

Copyright 2007-, Tatu Saloranta (tatu.saloranta@iki.fi)

## Licensing

Jackson 2.x core and extension components are licensed under Apache License 2.0
To find the details that apply to this artifact see the accompanying LICENSE file.

## Credits

A list of contributors may be found from CREDITS(-2.x) file, which is included
in some artifacts (usually source distributions); but is always available
from the source code management (SCM) system project uses.

## FastDoubleParser

jackson-core bundles a shaded copy of FastDoubleParser <https://github.com/wrandelshofer/FastDoubleParser>.
That code is available under an MIT license <https://github.com/wrandelshofer/FastDoubleParser/blob/main/LICENSE>
under the following copyright.

Copyright © 2023 Werner Randelshofer, Switzerland. MIT License.

See FastDoubleParser-NOTICE for details of other source code included in FastDoubleParser
and the licenses and copyrights that apply to that code.
@@ -0,0 +1,21 @@
# Jackson JSON processor

Jackson is a high-performance, Free/Open Source JSON processing library.
It was originally written by Tatu Saloranta (tatu.saloranta@iki.fi), and has
been in development since 2007.
It is currently developed by a community of developers.

## Copyright

Copyright 2007-, Tatu Saloranta (tatu.saloranta@iki.fi)

## Licensing

Jackson 2.x core and extension components are licensed under Apache License 2.0
To find the details that apply to this artifact see the accompanying LICENSE file.

## Credits

A list of contributors may be found from CREDITS(-2.x) file, which is included
in some artifacts (usually source distributions); but is always available
from the source code management (SCM) system project uses.
@@ -0,0 +1,31 @@
/*
 * Copyright 2010 Google Inc. All Rights Reserved.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

/*
 * Copyright 2011 Google Inc. All Rights Reserved.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
@@ -0,0 +1,201 @@
                                 Apache License
                           Version 2.0, January 2004
                        http://www.apache.org/licenses/

   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION

   1. Definitions.

      "License" shall mean the terms and conditions for use, reproduction,
      and distribution as defined by Sections 1 through 9 of this document.

      "Licensor" shall mean the copyright owner or entity authorized by
      the copyright owner that is granting the License.

      "Legal Entity" shall mean the union of the acting entity and all
      other entities that control, are controlled by, or are under common
      control with that entity. For the purposes of this definition,
      "control" means (i) the power, direct or indirect, to cause the
      direction or management of such entity, whether by contract or
      otherwise, or (ii) ownership of fifty percent (50%) or more of the
      outstanding shares, or (iii) beneficial ownership of such entity.

      "You" (or "Your") shall mean an individual or Legal Entity
      exercising permissions granted by this License.

      "Source" form shall mean the preferred form for making modifications,
      including but not limited to software source code, documentation
      source, and configuration files.

      "Object" form shall mean any form resulting from mechanical
      transformation or translation of a Source form, including but
      not limited to compiled object code, generated documentation,
      and conversions to other media types.

      "Work" shall mean the work of authorship, whether in Source or
      Object form, made available under the License, as indicated by a
      copyright notice that is included in or attached to the work
      (an example is provided in the Appendix below).

      "Derivative Works" shall mean any work, whether in Source or Object
      form, that is based on (or derived from) the Work and for which the
      editorial revisions, annotations, elaborations, or other modifications
      represent, as a whole, an original work of authorship. For the purposes
      of this License, Derivative Works shall not include works that remain
      separable from, or merely link (or bind by name) to the interfaces of,
      the Work and Derivative Works thereof.

      "Contribution" shall mean any work of authorship, including
      the original version of the Work and any modifications or additions
      to that Work or Derivative Works thereof, that is intentionally
      submitted to Licensor for inclusion in the Work by the copyright owner
      or by an individual or Legal Entity authorized to submit on behalf of
      the copyright owner. For the purposes of this definition, "submitted"
      means any form of electronic, verbal, or written communication sent
      to the Licensor or its representatives, including but not limited to
      communication on electronic mailing lists, source code control systems,
      and issue tracking systems that are managed by, or on behalf of, the
      Licensor for the purpose of discussing and improving the Work, but
      excluding communication that is conspicuously marked or otherwise
      designated in writing by the copyright owner as "Not a Contribution."

      "Contributor" shall mean Licensor and any individual or Legal Entity
      on behalf of whom a Contribution has been received by Licensor and
      subsequently incorporated within the Work.

   2. Grant of Copyright License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      copyright license to reproduce, prepare Derivative Works of,
      publicly display, publicly perform, sublicense, and distribute the
      Work and such Derivative Works in Source or Object form.

   3. Grant of Patent License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      (except as stated in this section) patent license to make, have made,
      use, offer to sell, sell, import, and otherwise transfer the Work,
      where such license applies only to those patent claims licensable
      by such Contributor that are necessarily infringed by their
      Contribution(s) alone or by combination of their Contribution(s)
      with the Work to which such Contribution(s) was submitted. If You
      institute patent litigation against any entity (including a
      cross-claim or counterclaim in a lawsuit) alleging that the Work
      or a Contribution incorporated within the Work constitutes direct
      or contributory patent infringement, then any patent licenses
      granted to You under this License for that Work shall terminate
      as of the date such litigation is filed.

   4. Redistribution. You may reproduce and distribute copies of the
      Work or Derivative Works thereof in any medium, with or without
      modifications, and in Source or Object form, provided that You
      meet the following conditions:

      (a) You must give any other recipients of the Work or
          Derivative Works a copy of this License; and

      (b) You must cause any modified files to carry prominent notices
          stating that You changed the files; and

      (c) You must retain, in the Source form of any Derivative Works
          that You distribute, all copyright, patent, trademark, and
          attribution notices from the Source form of the Work,
          excluding those notices that do not pertain to any part of
          the Derivative Works; and

      (d) If the Work includes a "NOTICE" text file as part of its
          distribution, then any Derivative Works that You distribute must
          include a readable copy of the attribution notices contained
          within such NOTICE file, excluding those notices that do not
          pertain to any part of the Derivative Works, in at least one
          of the following places: within a NOTICE text file distributed
          as part of the Derivative Works; within the Source form or
          documentation, if provided along with the Derivative Works; or,
          within a display generated by the Derivative Works, if and
          wherever such third-party notices normally appear. The contents
          of the NOTICE file are for informational purposes only and
          do not modify the License. You may add Your own attribution
          notices within Derivative Works that You distribute, alongside
          or as an addendum to the NOTICE text from the Work, provided
          that such additional attribution notices cannot be construed
          as modifying the License.

      You may add Your own copyright statement to Your modifications and
      may provide additional or different license terms and conditions
      for use, reproduction, or distribution of Your modifications, or
      for any such Derivative Works as a whole, provided Your use,
      reproduction, and distribution of the Work otherwise complies with
      the conditions stated in this License.

   5. Submission of Contributions. Unless You explicitly state otherwise,
      any Contribution intentionally submitted for inclusion in the Work
      by You to the Licensor shall be under the terms and conditions of
      this License, without any additional terms or conditions.
      Notwithstanding the above, nothing herein shall supersede or modify
      the terms of any separate license agreement you may have executed
      with Licensor regarding such Contributions.

   6. Trademarks. This License does not grant permission to use the trade
      names, trademarks, service marks, or product names of the Licensor,
      except as required for reasonable and customary use in describing the
      origin of the Work and reproducing the content of the NOTICE file.

   7. Disclaimer of Warranty. Unless required by applicable law or
      agreed to in writing, Licensor provides the Work (and each
      Contributor provides its Contributions) on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
      implied, including, without limitation, any warranties or conditions
      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
      PARTICULAR PURPOSE. You are solely responsible for determining the
      appropriateness of using or redistributing the Work and assume any
      risks associated with Your exercise of permissions under this License.

   8. Limitation of Liability. In no event and under no legal theory,
      whether in tort (including negligence), contract, or otherwise,
      unless required by applicable law (such as deliberate and grossly
      negligent acts) or agreed to in writing, shall any Contributor be
      liable to You for damages, including any direct, indirect, special,
      incidental, or consequential damages of any character arising as a
      result of this License or out of the use or inability to use the
      Work (including but not limited to damages for loss of goodwill,
      work stoppage, computer failure or malfunction, or any and all
      other commercial damages or losses), even if such Contributor
      has been advised of the possibility of such damages.

   9. Accepting Warranty or Additional Liability. While redistributing
      the Work or Derivative Works thereof, You may choose to offer,
      and charge a fee for, acceptance of support, warranty, indemnity,
      or other liability obligations and/or rights consistent with this
      License. However, in accepting such obligations, You may act only
      on Your own behalf and on Your sole responsibility, not on behalf
      of any other Contributor, and only if You agree to indemnify,
      defend, and hold each Contributor harmless for any liability
      incurred by, or claims asserted against, such Contributor by reason
      of your accepting any such warranty or additional liability.

   END OF TERMS AND CONDITIONS

   APPENDIX: How to apply the Apache License to your work.

      To apply the Apache License to your work, attach the following
      boilerplate notice, with the fields enclosed by brackets "[]"
      replaced with your own identifying information. (Don't include
      the brackets!) The text should be enclosed in the appropriate
      comment syntax for the file format. We also recommend that a
      file or class name and description of purpose be included on the
      same "printed page" as the copyright notice for easier
      identification within third-party archives.

   Copyright [yyyy] [name of copyright owner]

   Licensed under the Apache License, Version 2.0 (the "License");
   you may not use this file except in compliance with the License.
   You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
@@ -309,7 +309,8 @@ type Any
     if_nothing self ~other =
         const self other

-    ## If `self` is Nothing then returns Nothing, otherwise returns the result
+    ## GROUP Logical
+       If `self` is Nothing then returns Nothing, otherwise returns the result
        of running the provided `action`.

        > Example
@@ -145,9 +145,9 @@ read_text path (encoding=Encoding.utf_8) (on_problems=Problem_Behavior.Report_Warning) =

         example_list_files =
             Data.list_directory Examples.data_dir name_filter="**.md" recursive=True
-list_directory : (File | Text) -> Text -> Boolean -> Vector File
-list_directory directory name_filter=Nothing recursive=False =
-    File.new directory . list name_filter=name_filter recursive=recursive
+list_directory : File -> Text | Nothing -> Boolean -> Vector File
+list_directory directory:File name_filter:Text|Nothing=Nothing recursive:Boolean=False =
+    directory . list name_filter=name_filter recursive=recursive

 ## ALIAS download, http get
    GROUP Input
distribution/lib/Standard/Base/0.0.0-dev/src/Data/Java_Json.enso (new file, 208 lines)
@@ -0,0 +1,208 @@
import project.Any.Any
import project.Data.Array_Proxy.Array_Proxy
import project.Data.Json.Invalid_JSON
import project.Data.Json.JS_Object
import project.Data.Numbers.Integer
import project.Data.Numbers.Number
import project.Data.Pair.Pair
import project.Data.Text.Text
import project.Data.Vector.Vector
import project.Error.Error
import project.Errors.No_Such_Key.No_Such_Key
import project.Metadata.Display
import project.Metadata.Widget
import project.Nothing.Nothing
import project.Panic.Panic
from project.Data.Boolean import Boolean, False, True
from project.Data.Json.Extensions import all
from project.Data.Ordering import all
from project.Data.Text.Extensions import all
from project.Metadata.Choice import Option
from project.Metadata.Widget import Single_Choice

polyglot java import com.fasterxml.jackson.core.JsonProcessingException
polyglot java import com.fasterxml.jackson.databind.JsonNode
polyglot java import com.fasterxml.jackson.databind.ObjectMapper
polyglot java import com.fasterxml.jackson.databind.node.ArrayNode
polyglot java import com.fasterxml.jackson.databind.node.JsonNodeType
polyglot java import com.fasterxml.jackson.databind.node.ObjectNode

## PRIVATE
   Jackson-based JSON Parser
type Java_Json
    parse : Text -> Nothing | Boolean | Number | Text | Vector | Jackson_Object
    parse text:Text =
        error_handler js_exception =
            Error.throw (Invalid_JSON.Error js_exception.payload.message)

        Panic.catch JsonProcessingException handler=error_handler <|
            node = ObjectMapper.new.readTree text
            read_json_node node

## PRIVATE
   Read a JsonNode to an Enso type
read_json_node : JsonNode -> Nothing | Boolean | Number | Text | Vector | Jackson_Object
read_json_node node = case node.getNodeType of
    JsonNodeType.NULL -> Nothing
    JsonNodeType.BOOLEAN -> node.asBoolean
    JsonNodeType.STRING -> node.asText
    JsonNodeType.NUMBER ->
        if node.isFloatingPointNumber then node.asDouble else node.asLong
    JsonNodeType.ARRAY -> read_json_array node
    JsonNodeType.OBJECT -> Jackson_Object.new node

## PRIVATE
   Read a JsonNode to a Vector
read_json_array : ArrayNode -> Vector
read_json_array node =
    proxy = Array_Proxy.new node.size i-> (read_json_node (node.get i))
    Vector.from_polyglot_array proxy

## PRIVATE
type Jackson_Object
    new : ObjectNode -> Jackson_Object
    new object_node =
        make_field_names object =
            name_iterator = object.fieldNames
            builder = Vector.new_builder object.size
            loop iterator builder =
                if iterator.hasNext.not then builder.to_vector else
                    builder.append iterator.next
                    @Tail_Call loop iterator builder
            loop name_iterator builder
        Jackson_Object.Value object_node (make_field_names object_node)

    ## PRIVATE
    Value object_node ~field_array

    ## GROUP Logical
       Returns True iff the object contains the given `key`.
    contains_key : Text -> Boolean
    contains_key self key:Text = self.object_node.has key

    ## Get a value for a key of the object, or a default value if that key is not present.

       Arguments:
       - key: The key to get.
       - if_missing: The value to return if the key is not found.
    @key make_field_name_selector
    get : Text -> Any -> Nothing | Boolean | Number | Text | Vector | Jackson_Object
    get self key:Text ~if_missing=Nothing =
        if self.contains_key key . not then if_missing else
            child = self.object_node.get key
            read_json_node child

    ## GROUP Selections
       Get a value for a key of the object.
       If the key is not found, throws a `No_Such_Key` error.

       Arguments:
       - key: The key to get.
    @key make_field_name_selector
    at : Text -> Jackson_Object | Boolean | Number | Nothing | Text | Vector ! No_Such_Key
    at self key:Text = self.get key (Error.throw (No_Such_Key.Error self key))

    ## GROUP Metadata
       Get the keys of the object.
    field_names : Vector
    field_names self = self.field_array

    ## Maps a function over each value in this object

       Arguments:
       - function: The function to apply to each value in the map, taking a
         value and returning a value.
    map : (Any->Any) -> Vector
    map self function =
        kv_func = _ -> function
        self.map_with_key kv_func

    ## Maps a function over each field-value pair in the object.

       Arguments:
       - function: The function to apply to each key and value in the map,
         taking a key and a value and returning a value.
    map_with_key : (Any -> Any -> Any) -> Vector
    map_with_key self function =
        self.field_names.map key->
            value = self.get key
            function key value

    ## GROUP Conversions
       Convert the object to a Vector of Pairs.
    to_vector : Vector
    to_vector self =
        keys = self.field_array
        proxy = Array_Proxy.new keys.length (i-> [(keys.at i), (self.get (keys.at i))])
        Vector.from_polyglot_array proxy

    ## GROUP Metadata
       Gets the number of keys in the object.
    length : Number
    length self = self.object_node.size

    ## GROUP Logical
       Returns True iff the Map is empty, i.e., does not have any entries.
    is_empty : Boolean
    is_empty self = self.length == 0

    ## GROUP Logical
       Returns True iff the Map is not empty, i.e., has at least one entry.
    not_empty : Boolean
    not_empty self = self.is_empty.not

    ## PRIVATE
       Convert the object to a JS_Object.
    to_js_object : JS_Object
    to_js_object self =
        pairs = self.field_names.map name-> [name, self.at name . to_js_object]
        JS_Object.from_pairs pairs

    ## PRIVATE
       Convert to a Text.
    to_text : Text
    to_text self = self.to_json

    ## PRIVATE
       Convert the object to a friendly string.
    to_display_text : Text
    to_display_text self =
        self.to_text.to_display_text

    ## PRIVATE
       Convert to a JSON representation.
    to_json : Text
    to_json self = self.object_node.toString

## PRIVATE
   Make a field name selector
make_field_name_selector : Jackson_Object -> Display -> Widget
make_field_name_selector js_object display=Display.Always =
    Single_Choice display=display values=(js_object.field_names.map n->(Option n n.pretty))

## Extension for Text to allow use.
Text.parse_fast_json : Nothing | Boolean | Number | Text | Vector | Jackson_Object
Text.parse_fast_json self = Java_Json.parse self

## PRIVATE
type Jackson_Object_Comparator
    ## PRIVATE
    compare : Jackson_Object -> Jackson_Object -> (Ordering|Nothing)
    compare obj1 obj2 =
        obj1_keys = obj1.field_names
        obj2_keys = obj2.field_names
        same_values = obj1_keys.length == obj2_keys.length && obj1_keys.all key->
            (obj1.get key == obj2.at key).catch No_Such_Key _->False
        if same_values then Ordering.Equal else Nothing

    ## PRIVATE
    hash : Jackson_Object -> Integer
    hash obj =
        values_hashes = obj.field_names.map field_name->
            val = obj.get field_name
            Comparable.from val . hash val
        # Return sum, as we don't care about ordering of field names
        values_hashes.fold 0 (+)

## PRIVATE
Comparable.from (_:Jackson_Object) = Jackson_Object_Comparator
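
A short usage sketch of the API above (the JSON literal is invented): `get` falls back to `if_missing` for absent keys while `at` raises `No_Such_Key`, and `Text.parse_fast_json` is the convenience entry point.

    example_parse =
        obj = '{"name": "Enso", "tags": [1, 2, 3]}'.parse_fast_json
        name = obj.at "name"                    # "Enso"
        tags = obj.get "tags"                   # a proxy-backed Vector [1, 2, 3]
        missing = obj.get "owner" if_missing="" # default instead of an error
        [name, tags, missing]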
@@ -168,18 +168,49 @@ type JS_Object
     field_names self =
         Vector.from_polyglot_array (get_property_names self.js_object)

+    ## Maps a function over each value in this object
+
+       Arguments:
+       - function: The function to apply to each value in the map, taking a
+         value and returning a value.
+    map : (Any->Any) -> Vector
+    map self function =
+        kv_func = _ -> function
+        self.map_with_key kv_func
+
+    ## Maps a function over each field-value pair in the object.
+
+       Arguments:
+       - function: The function to apply to each key and value in the map,
+         taking a key and a value and returning a value.
+    map_with_key : (Any -> Any -> Any) -> Vector
+    map_with_key self function =
+        self.field_names.map key->
+            value = self.get key
+            function key value
+
+    ## GROUP Metadata
+       Gets the number of keys in the object.
+    length : Number
+    length self =
+        get_property_names self.js_object . length
+
+    ## GROUP Logical
+       Returns True iff the Map is empty, i.e., does not have any entries.
+    is_empty : Boolean
+    is_empty self = self.length == 0
+
+    ## GROUP Logical
+       Returns True iff the Map is not empty, i.e., has at least one entry.
+    not_empty : Boolean
+    not_empty self = self.is_empty.not
+
     ## GROUP Conversions
-       Convert the object to a Vector of Pairs.
+       Convert the object to a Vector of Key and Values.
     to_vector : Vector
     to_vector self =
         keys = get_property_names self.js_object
-        proxy = Array_Proxy.new keys.length (i-> Pair.new (keys.at i) (self.get (keys.at i)))
+        proxy = Array_Proxy.new keys.length (i-> [(keys.at i), (self.get (keys.at i))])
        Vector.from_polyglot_array proxy

    ## PRIVATE
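
The new `map_with_key` (mirrored on `Jackson_Object` above) walks `field_names` and applies the function to each key/value pair; a small sketch, assuming `Json.parse` as the usual way to construct a `JS_Object`:

    example_map_with_key =
        obj = Json.parse '{"a": 1, "b": 2}'
        # Produces the Vector [["a", 2], ["b", 4]].
        obj.map_with_key key-> value-> [key, value * 2]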
@@ -1,4 +1,5 @@
 import project.Any.Any
+import project.Data.Array_Proxy.Array_Proxy
 import project.Data.Filter_Condition.Filter_Condition
 import project.Data.Numbers.Integer
 import project.Data.Pair.Pair
@@ -502,7 +503,10 @@ type Range

         1.up_to 6 . to_vector
     to_vector : Vector Integer
-    to_vector self = self.map x->x
+    to_vector self =
+        proxy = Array_Proxy.new self.length self.at
+        Vector.from_polyglot_array proxy


 ## Combines all the elements of a non-empty range using a binary operation.
    If the range is empty, returns `if_empty`.
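
Because `to_vector` now returns an `Array_Proxy`-backed Vector, construction is cheap and elements are computed from `self.at` only when accessed; a sketch of the observable behaviour:

    example_proxy =
        v = 0.up_to 1000000 . to_vector
        # No million-element array was allocated up front; this computes
        # a single element on demand through the proxy.
        v.at 123456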
@@ -1,4 +1,5 @@
 import project.Any.Any
+import project.Data.Array_Proxy.Array_Proxy
 import project.Data.Filter_Condition.Filter_Condition
 import project.Data.Json.JS_Object
 import project.Data.Numbers.Integer
@@ -202,7 +203,9 @@ type Date_Range

         (Date.new 2021 05 07).up_to (Date.new 2021 05 10) . to_vector
     to_vector : Vector Date
-    to_vector self = self.map x->x
+    to_vector self =
+        proxy = Array_Proxy.new self.length self.at
+        Vector.from_polyglot_array proxy

     ## GROUP Logical
        Checks if this range is empty.
@@ -36,7 +36,7 @@ polyglot java import org.enso.base.Time_Utils

 ## PRIVATE
 unix_epoch_start : Date_Time
-unix_epoch_start = Date_Time.new 1970
+unix_epoch_start = Date_Time.new 1970 zone=Time_Zone.utc

 ## PRIVATE
 ensure_in_epoch : (Date_Time | Date) -> (Any -> Any) -> Any
@@ -281,6 +281,37 @@ type Date_Time
     parse text:Text format:Date_Time_Formatter=Date_Time_Formatter.default_enso_zoned_date_time =
         format.parse_date_time text

+    ## Creates a new `Date_Time` from a Unix epoch timestamp in seconds (and optional nanoseconds).
+
+       Arguments:
+       - seconds: The number of seconds since the Unix epoch.
+       - nanoseconds: The number of nanoseconds within the second.
+
+       > Example
+         Create a new `Date_Time` from a Unix epoch timestamp.
+
+             from Standard.Base import Date_Time
+
+             example_from_unix_epoch = Date_Time.from_unix_epoch_seconds 1601587200
+    from_unix_epoch_seconds : Integer -> Integer -> Date_Time
+    from_unix_epoch_seconds seconds:Integer nanoseconds:Integer=0 =
+        unix_epoch_start + Duration.new seconds=seconds nanoseconds=nanoseconds
+
+    ## Creates a new `Date_Time` from a Unix epoch timestamp in milliseconds.
+
+       Arguments:
+       - milliseconds: The number of milliseconds since the Unix epoch.
+
+       > Example
+         Create a new `Date_Time` from a Unix epoch timestamp.
+
+             from Standard.Base import Date_Time
+
+             example_from_unix_epoch = Date_Time.from_unix_epoch_milliseconds 1601587200000
+    from_unix_epoch_milliseconds : Integer -> Date_Time
+    from_unix_epoch_milliseconds milliseconds:Integer =
+        unix_epoch_start + Duration.new milliseconds=milliseconds
+
     ## GROUP Metadata
        Get the year portion of the time.
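
With `unix_epoch_start` now pinned to UTC, both constructors above reduce to adding a `Duration` to a fixed instant, so the seconds and milliseconds variants agree; for example:

    example_epoch =
        a = Date_Time.from_unix_epoch_seconds 1601587200
        b = Date_Time.from_unix_epoch_milliseconds 1601587200*1000
        # True: the same instant either way, built as unix_epoch_start + Duration.
        a == b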
distribution/lib/Standard/Base/0.0.0-dev/src/Logging.enso (new file, 39 lines)
@@ -0,0 +1,39 @@
import project.Any.Any
import project.Data.Text.Text
import project.Meta
import project.Nothing.Nothing

polyglot java import java.util.logging.Logger

## PRIVATE
   Log a message.
   This function needs to be enabled by importing `Standard.Base.Logging` using
   `from Standard.Base.Logging import all`.
Any.log_message : Text -> Log_Level -> Any
Any.log_message self ~message:Text level:Log_Level=Log_Level.Info =
    type_name = Meta.get_qualified_type_name self
    logger = Logger.getLogger type_name
    case level of
        Log_Level.Finest -> logger.finest message
        Log_Level.Fine -> logger.fine message
        Log_Level.Info -> logger.info message
        Log_Level.Warning -> logger.warning message
        Log_Level.Severe -> logger.severe message
    self

## PRIVATE
type Log_Level
    ## Finest (Trace) level log message.
    Finest

    ## Fine (Debug) level log message.
    Fine

    ## Info level log message.
    Info

    ## Warning level log message.
    Warning

    ## Severe level log message.
    Severe
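
A usage sketch of the extension method above: `log_message` routes through `java.util.logging` under the value's qualified type name and returns `self`, so calls can be chained mid-pipeline.

    from Standard.Base import all
    from Standard.Base.Logging import all

    example_log value =
        # Defaults to Log_Level.Info and returns `value` unchanged.
        value.log_message "Processing started" . log_message "Extra detail" level=Log_Level.Fine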
@@ -191,11 +191,10 @@ type Response_Body

         example_to_file =
             Examples.get_geo_data.to_file Examples.scratch_file
-    to_file : File | Text -> Existing_File_Behavior -> File
-    to_file self file on_existing_file=Existing_File_Behavior.Backup =
+    to_file : File -> Existing_File_Behavior -> File
+    to_file self file:File on_existing_file=Existing_File_Behavior.Backup =
         self.with_stream body_stream->
-            f = File.new file
-            r = on_existing_file.write f output_stream->
+            r = on_existing_file.write file output_stream->
                 output_stream.write_stream body_stream
             r.if_not_error file
@@ -23,11 +23,10 @@ from project.Data.Text.Extensions import all

 polyglot java import java.net.URI as Java_URI
 polyglot java import java.net.URISyntaxException
-polyglot java import org.graalvm.collections.Pair as Java_Pair
-
 polyglot java import org.enso.base.enso_cloud.EnsoSecretAccessDenied
 polyglot java import org.enso.base.net.URITransformer
 polyglot java import org.enso.base.net.URIWithSecrets
+polyglot java import org.graalvm.collections.Pair as Java_Pair

 ## Represents a Uniform Resource Identifier (URI) reference.
 type URI
@@ -31,7 +31,8 @@ type Nothing
     if_nothing : Any -> Any
     if_nothing self ~function = function

-    ## If `self` is Nothing then returns Nothing, otherwise returns the result
+    ## GROUP Logical
+       If `self` is Nothing then returns Nothing, otherwise returns the result
        of running the provided `action`.

        > Example
@@ -821,3 +821,7 @@ find_extension_from_name : Text -> Text
 find_extension_from_name name =
     extension = name.drop (Text_Sub_Range.Before_Last ".")
     if extension == "." then "" else extension
+
+## PRIVATE
+   Convert from a Text to a File.
+File.from (that:Text) = File.new that
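
This conversion is what lets the signatures tightened elsewhere in this commit (`Text.write : File -> ...`, `Data.list_directory : File -> ...`) keep accepting plain `Text` paths, since the annotated `File` arguments pick up `File.from` automatically; a sketch with invented paths:

    example_conversion =
        # The Text path is converted to a File via File.from.
        "Hello".write "output.txt"
        Data.list_directory "data" recursive=True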
@@ -50,15 +50,14 @@ polyglot java import org.enso.base.Array_Utils

    This allows for building the workflow without affecting the real files.
 @encoding Encoding.default_widget
-Text.write : (File|Text) -> Encoding -> Existing_File_Behavior -> Problem_Behavior -> File ! Encoding_Error | Illegal_Argument | File_Error
-Text.write self path encoding=Encoding.utf_8 on_existing_file=Existing_File_Behavior.Backup on_problems=Problem_Behavior.Report_Warning =
+Text.write : File -> Encoding -> Existing_File_Behavior -> Problem_Behavior -> File ! Encoding_Error | Illegal_Argument | File_Error
+Text.write self path:File encoding=Encoding.utf_8 on_existing_file=Existing_File_Behavior.Backup on_problems=Problem_Behavior.Report_Warning =
     bytes = self.bytes encoding on_problems
     bytes.if_not_error <|
-        actual = File.new path
-        effective_existing_behaviour = on_existing_file.get_effective_behavior actual
-        file = if Context.Output.is_enabled then actual else
+        effective_existing_behaviour = on_existing_file.get_effective_behavior path
+        file = if Context.Output.is_enabled then path else
             should_copy_file = on_existing_file==Existing_File_Behavior.Append
-            actual.create_dry_run_file copy_original=should_copy_file
+            path.create_dry_run_file copy_original=should_copy_file

         Context.Output.with_enabled <|
             r = effective_existing_behaviour.write file stream->
@@ -97,13 +96,12 @@ Text.write self path encoding=Encoding.utf_8 on_existing_file=Existing_File_Behavior.Backup on_problems=Problem_Behavior.Report_Warning =
     import Standard.Examples

     [36, -62, -93, -62, -89, -30, -126, -84, -62, -94].write_bytes Examples.scratch_file Existing_File_Behavior.Append
-Vector.write_bytes : (File|Text) -> Existing_File_Behavior -> File ! Illegal_Argument | File_Error
-Vector.write_bytes self path on_existing_file=Existing_File_Behavior.Backup =
+Vector.write_bytes : File -> Existing_File_Behavior -> File ! Illegal_Argument | File_Error
+Vector.write_bytes self path:File on_existing_file=Existing_File_Behavior.Backup =
     Panic.catch Unsupported_Argument_Types handler=(_ -> Error.throw (Illegal_Argument.Error "Only Vectors consisting of bytes (integers in the range from -128 to 127) are supported by the `write_bytes` method.")) <|
         ## Convert to a byte array before writing - and fail early if there is any problem.
         byte_array = Array_Utils.ensureByteArray self

-        file = File.new path
-        r = on_existing_file.write file stream->
+        r = on_existing_file.write path stream->
             stream.write_bytes (Vector.from_polyglot_array byte_array)
-        r.if_not_error file
+        r.if_not_error path
@@ -0,0 +1,222 @@
import project.Any.Any
import project.Data.Array_Proxy.Array_Proxy
import project.Data.Numbers.Integer
import project.Data.Text.Encoding.Encoding
import project.Data.Text.Text
import project.Data.Vector.Vector
import project.Error.Error
import project.Errors.Common.Index_Out_Of_Bounds
import project.Errors.File_Error.File_Error
import project.Nothing.Nothing
import project.System.File.File
from project.Data.Boolean import Boolean, False, True
from project.Data.Range.Extensions import all
from project.Data.Text.Extensions import all
from project.Logging import all

polyglot java import org.enso.base.Array_Utils
polyglot java import org.enso.base.arrays.LongArrayList
polyglot java import org.enso.base.FileLineReader
polyglot java import java.io.File as Java_File
polyglot java import java.nio.charset.Charset

type File_By_Line
    ## Creates a new File_By_Line object.

       Arguments:
       - file: The file to read.
       - encoding: The encoding to use when reading the file (defaults to UTF 8).
       - offset: The position within the file to read from (defaults to first byte).
    new : File->Encoding->File_By_Line
    new file:File encoding:Encoding=Encoding.utf_8 offset:Integer=0 =
        create_row_map =
            row_map = LongArrayList.new
            row_map.add offset
            File_By_Line.log_message "Created row map"
            row_map
        File_By_Line.Reader file encoding Nothing Nothing create_row_map

    ## PRIVATE
       Creates a new File_By_Line object.

       Arguments:
       - file: The file to read.
       - encoding: The encoding to use when reading the file (defaults to UTF 8).
       - limit_lines: The number of lines to read (defaults to all lines).
       - filter_func: The filter to apply to each line (defaults to no filter).
       - row_map: The row map to use (defaults to a new row map).
       - file_end: The end of the file in bytes.
    Reader file:File encoding:Encoding limit_lines:(Integer|Nothing) filter_func row_map file_end=file.size

    ## Reads a specific line from the file.

       Arguments:
       - line: The line to read (0 indexed).
    get : Integer->Text
    get self line:Integer = if self.limit_lines.is_nothing.not && line>self.limit_lines then Error.throw (Index_Out_Of_Bounds.Error line self.limit_lines) else
        read_line self line

    ## Reads the first line
    first : Text
    first self = self.get 0

    ## Reads the second line
    second : Text
    second self = self.get 1

    ## Counts the number of lines in the file.
    count : Integer
    count self =
        end_at = if self.limit_lines.is_nothing then -1 else self.limit_lines
        for_each_lines self 0 end_at Nothing
        ## We've added all the indexes to the row map, including the last one,
           so we need to subtract 1. As the row_map can be shared, if we have a
           limit, return that.
        if end_at == -1 then self.row_map.getSize-1 else end_at.min self.row_map.getSize-1

    ## Returns the lines in the file as a vector.
    to_vector : Vector Text
    to_vector self = File_Error.handle_java_exceptions self.file <|
        end_at = if self.limit_lines.is_nothing then Nothing else self.limit_lines-1
        FileLineReader.readLines self.java_file self.file_end self.row_map 0 end_at self.charset self.filter_func

    ## Performs an action on each line.

       Arguments:
       - function: The action to perform on each line.
    each : (Text -> Any) -> Nothing
    each self function =
        new_function _ t = function t
        self.each_with_index new_function

    ## Performs an action on each line.

       Arguments:
       - function: A function to apply that takes an index and an item.

       The function is called with both the element index as well as the
       element itself.
    each_with_index : (Integer -> Text -> Any) -> Nothing
    each_with_index self function =
        end_at = if self.limit_lines.is_nothing then Nothing else self.limit_lines-1
        for_each_lines self 0 end_at function

    ## Transforms each line in the file and returns the result as a vector.

       Arguments:
       - action: The action to perform on each line.
    map : (Text -> Any) -> Vector Any
    map self action =
        builder = Vector.new_builder
        wrapped_action _ t = builder.append (action t)
        self.each_with_index wrapped_action
        builder.to_vector

    ## Transforms each line in the file and returns the result as a vector.

       Arguments:
       - action: The action to perform on each line.
    map_with_index : (Integer -> Text -> Any) -> Vector Any
    map_with_index self action =
        builder = Vector.new_builder
        wrapped_action i t = builder.append (action i t)
        self.each_with_index wrapped_action
        builder.to_vector

    ## Skips the specified number of lines.

       Arguments:
       - lines: The number of lines to skip.
    skip : Integer -> File_By_Line
    skip self lines:Integer =
        ## Read the line
        create_row_map parent line =
            if parent.row_map.getSize <= line then for_each_lines self 0 line Nothing
            position = parent.row_map.getOrLast line
            row_map = LongArrayList.new
            row_map.add position
            parent.log_message "Created Skipped Row Map ("+line.to_text+")"
            row_map

        new_limit = if self.limit_lines.is_nothing then lines else lines.min self.limit_lines
        File_By_Line.Reader self.file self.encoding new_limit self.filter_func (create_row_map self lines) self.file_end

    ## Limits a file to a specific number of lines.

       Arguments:
       - lines: The number of lines to limit the file to.
    limit : Integer -> File_By_Line
    limit self lines:Integer =
        File_By_Line.Reader self.file self.encoding lines self.filter_func self.row_map self.file_end

    ## Filters the file by a predicate.

       Arguments:
       - predicate: The predicate to filter by.
    filter : Text | (Text -> Boolean) -> File_By_Line
    filter self predicate =
        ## Create the predicate
        new_filter = case predicate of
            _:Text -> FileLineReader.createContainsFilter predicate self.charset
            _ -> FileLineReader.wrapBooleanFilter predicate self.charset

        ## Find the index of the first line matching the new filter.
        make_filter_map parent new_filter =
            end_at = if parent.limit_lines.is_nothing then -1 else parent.limit_lines-1
            first_index = FileLineReader.findFirstNewFilter parent.java_file parent.file_end parent.row_map end_at parent.charset parent.filter_func new_filter
            new_row_map = LongArrayList.new
            new_row_map.add first_index
            parent.log_message "Found Filter Start - "+first_index.to_text
            new_row_map

        ## Merge the two predicates together.
        new_predicate = if self.filter_func.is_nothing then new_filter else
            FileLineReader.mergeTwoFilters self.filter_func new_filter

        ## If the parent is limited, we need to limit the child by its end position in the file.
        if self.limit_lines == Nothing then File_By_Line.Reader self.file self.encoding Nothing new_predicate (make_filter_map self new_filter) self.file_end else
            ## Find the index of the last line matching the new filter.
            index_of parent line =
                file_len = if parent.row_map.getSize > line then parent.row_map.get line else
                    for_each_lines self 0 line Nothing
                    parent.row_map.get parent.row_map.getSize-1
                parent.log_message "Created File End ("+line.to_text+") - "+file_len.to_text
                file_len
            File_By_Line.Reader self.file self.encoding Nothing new_predicate (make_filter_map self new_filter) (index_of self self.limit_lines)

    ## ADVANCED
       Exports the row_map
    row_positions : Vector Integer
    row_positions self = Vector.from_polyglot_array <|
        Array_Proxy.new self.row_map.getSize (i-> self.row_map.get i)

    ## PRIVATE
       Gets the Java_File for the backing file.
    java_file : Java_File
    java_file self = Java_File.new self.file.path

    ## PRIVATE
       Gets the encoding as a Java Charset.
    charset : Charset
    charset self = self.encoding.to_java_charset

## PRIVATE
   Reads a specific line from the file.
read_line : File_By_Line->Integer->Any->Any
read_line file:File_By_Line line:Integer=0 ~default=Nothing = File_Error.handle_java_exceptions file.file <|
    FileLineReader.readSingleLine file.java_file file.file_end file.row_map line file.charset file.filter_func . if_nothing default

## PRIVATE
   Performs an action on each line in the file.
for_each_lines : File_By_Line->Integer->(Integer|Nothing)->Any->Any
for_each_lines file:File_By_Line start_at:Integer end_at:(Integer|Nothing) action = File_Error.handle_java_exceptions file.file <|
    java_file = file.java_file
    row_map = file.row_map
    file_end = file.file_end
    charset = file.charset

    ## First, if we haven't found the start_at line yet, we need to find it.
    if start_at >= row_map.getSize then FileLineReader.readSingleLine java_file file_end row_map start_at charset file.filter_func

    ## Now we can read the lines we need.
    if row_map.getOrLast start_at >= file_end then Error.throw (Index_Out_Of_Bounds.Error start_at row_map.getSize) else
        FileLineReader.forEachLine java_file file_end row_map start_at (end_at.if_nothing -1) charset file.filter_func action
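
A sketch chaining the combinators above (the file name is invented): `filter` with a `Text` argument builds a contains-filter evaluated on the Java side, while `skip` and `limit` return new `File_By_Line` values that reuse the parent's row map rather than re-scanning the file.

    example_pipeline =
        lines = File_By_Line.new (File.new "server.log")
        errors = lines.filter "ERROR"
        # Drop the first 10 matching lines, keep at most the next 100.
        errors.skip 10 . limit 100 . to_vector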
@@ -7,7 +7,7 @@ type Client_Certificate
     - cert_file: path to the client certificate file.
     - key_file: path to the client key file.
     - key_password: password for the client key file.
-    Value cert_file:(File|Text) key_file:(File|Text) (key_password:Text='')
+    Value cert_file:File key_file:File (key_password:Text='')

     ## PRIVATE
        Creates the JDBC properties for the client certificate.
@@ -18,5 +18,5 @@ type Client_Certificate
        - sslpass: password for the client key file.
     properties : Vector
     properties self =
-        base = [Pair.new 'sslcert' (File.new self.cert_file).absolute.path, Pair.new 'sslkey' (File.new self.key_file).absolute.path]
+        base = [Pair.new 'sslcert' self.cert_file.absolute.path, Pair.new 'sslkey' self.key_file.absolute.path]
         if self.key_password == "" then base else base + [Pair.new 'sslpassword' self.key_password]
@@ -8,7 +8,7 @@ type SQLite_Details

    Arguments:
    - location: Location of the SQLite database to connect to.
-    SQLite (location:(In_Memory|File|Text))
+    SQLite (location:(In_Memory|File))

     ## PRIVATE
        Build the Connection resource.
@@ -25,7 +25,7 @@ type SQLite_Details
     jdbc_url : Text
     jdbc_url self = case self.location of
         In_Memory -> "jdbc:sqlite::memory:"
-        _ -> "jdbc:sqlite:" + ((File.new self.location).absolute.path.replace '\\' '/')
+        _ -> "jdbc:sqlite:" + (self.location.absolute.path.replace '\\' '/')

     ## PRIVATE
        Provides the properties for the connection.
@@ -12,8 +12,8 @@ type SSL_Mode

     ## Will use SSL, validating the certificate but not verifying the hostname.
        If `ca_file` is `Nothing`, the default CA certificate store will be used.
-    Verify_CA ca_file:Nothing|File|Text=Nothing
+    Verify_CA ca_file:Nothing|File=Nothing

     ## Will use SSL, validating the certificate and checking the hostname matches.
        If `ca_file` is `Nothing`, the default CA certificate store will be used.
-    Full_Verification ca_file:Nothing|File|Text=Nothing
+    Full_Verification ca_file:Nothing|File=Nothing
@@ -1,4 +1,5 @@
 from Standard.Base import all
+import Standard.Base.Errors.Common.Index_Out_Of_Bounds
 import Standard.Base.Errors.Illegal_Argument.Illegal_Argument
 import Standard.Base.Errors.Illegal_State.Illegal_State
 import Standard.Base.Internal.Rounding_Helpers
@@ -112,6 +113,44 @@ type Column
     to_vector : Vector Any
     to_vector self = self.read max_rows=Nothing . to_vector

+    ## GROUP Standard.Base.Selections
+       ICON select_row
+       Returns the value contained in this column at the given index.
+
+       Arguments:
+       - index: The index in the column from which to get the value.
+
+       If the value is an NA then this method returns nothing. If the index is
+       not an index in the column it returns an `Index_Out_Of_Bounds`.
+
+       > Example
+         Get the first element from a column.
+
+             import Standard.Examples
+
+             example_at = Examples.integer_column.at 0
+    at : Integer -> (Any | Nothing) ! Index_Out_Of_Bounds
+    at self (index : Integer) =
+        self.get index (Error.throw (Index_Out_Of_Bounds.Error index self.length))
+
+    ## GROUP Standard.Base.Selections
+       ICON select_row
+       Returns the value contained in this column at the given index.
+
+       Arguments:
+       - index: The index in the column from which to get the value.
+       - default: The value if the index is out of range.
+
+       > Example
+         Get the first element from a column.
+
+             import Standard.Examples
+
+             example_at = Examples.integer_column.get 0 -1
+    get : Integer -> (Any | Nothing)
+    get self (index : Integer) (~default=Nothing) =
+        self.read index+1 . get index default
+
     ## GROUP Standard.Base.Metadata
        Returns the `Value_Type` associated with that column.
@@ -519,7 +519,7 @@ type Table
         if Helpers.check_integrity self column then column else
             Panic.throw (Integrity_Error.Error "Column "+column.name)

-    ## ALIAS filter rows
+    ## ALIAS filter rows, where
        GROUP Standard.Base.Selections
        ICON preparation

@@ -2516,8 +2516,8 @@ type Table
         table = connection.query (SQL_Query.Table_Name "Table")
         table.write (enso_project.data / "example_csv_output.csv")
     @format Widget_Helpers.write_table_selector
-    write : File|Text -> File_Format -> Existing_File_Behavior -> Match_Columns -> Problem_Behavior -> Nothing ! Column_Count_Mismatch | Illegal_Argument | File_Error
-    write self path format=Auto_Detect on_existing_file=Existing_File_Behavior.Backup match_columns=Match_Columns.By_Name on_problems=Report_Warning =
+    write : File -> File_Format -> Existing_File_Behavior -> Match_Columns -> Problem_Behavior -> Nothing ! Column_Count_Mismatch | Illegal_Argument | File_Error
+    write self path:File format=Auto_Detect on_existing_file=Existing_File_Behavior.Backup match_columns=Match_Columns.By_Name on_problems=Report_Warning =
         # TODO This should ideally be done in a streaming manner, or at least respect the row limits.
         self.read.write path format on_existing_file match_columns on_problems
@ -24,9 +24,9 @@ polyglot java import java.sql.DatabaseMetaData
polyglot java import java.sql.PreparedStatement
polyglot java import java.sql.SQLException
polyglot java import java.sql.SQLTimeoutException
polyglot java import org.graalvm.collections.Pair as Java_Pair
polyglot java import org.enso.database.dryrun.OperationSynchronizer
polyglot java import org.enso.database.JDBCProxy
polyglot java import org.graalvm.collections.Pair as Java_Pair

type JDBC_Connection
## PRIVATE
|
@ -2056,8 +2056,26 @@ type Column
example_at = Examples.integer_column.at 0
at : Integer -> (Any | Nothing) ! Index_Out_Of_Bounds
at self (index : Integer) =
self.get index (Error.throw (Index_Out_Of_Bounds.Error index self.length))

## GROUP Standard.Base.Selections
ICON select_row
Returns the value contained in this column at the given index.

Arguments:
- index: The index in the column from which to get the value.
- default: The value if the index is out of range.

> Example
Get the first element from a column.

import Standard.Examples

example_at = Examples.integer_column.get 0 -1
get : Integer -> (Any | Nothing)
get self (index : Integer) (~default=Nothing) =
valid_index = (index >= 0) && (index < self.length)
if valid_index.not then Error.throw (Index_Out_Of_Bounds.Error index self.length) else
if valid_index.not then default else
storage = self.java_column.getStorage
if storage.isNa index then Nothing else
storage.getItem index
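The two accessors differ only in their out-of-range behaviour; a minimal sketch (the column name and values are illustrative):

    from Standard.Table import Column

    column = Column.from_vector "Test" [1, 2, 3]
    column.get 10 -1    # => -1 (the provided default)
    column.at 10        # => an Index_Out_Of_Bounds error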
|
@ -1,5 +1,6 @@
from Standard.Base import all
import Standard.Base.Errors.Illegal_Argument.Illegal_Argument
import Standard.Base.Data.Java_Json.Jackson_Object

## PRIVATE
A special type describing how to convert an object into a set of table
@ -15,6 +16,10 @@ type Convertible_To_Columns
Convertible_To_Columns.from (that:JS_Object) =
Convertible_To_Columns.Value that.field_names that.get

## PRIVATE
Convertible_To_Columns.from (that:Jackson_Object) =
Convertible_To_Columns.Value that.field_names that.get

## PRIVATE
Convertible_To_Columns.from (that:Map) =
pairs = that.keys.map k-> [k.to_text, k]
@ -23,6 +28,24 @@ Convertible_To_Columns.from (that:Map) =
Error.throw (Illegal_Argument.Error "Cannot convert "+that.to_display_text+" to a set of columns, because its keys are duplicated when converted to text.")
Convertible_To_Columns.Value field_map.keys (k-> that.get (field_map.get k))

## PRIVATE
Convertible_To_Columns.from (that:Pair) =
Convertible_To_Columns.Value ["0", "1"] (k-> if k == "0" then that.first else that.second)

## PRIVATE
Convertible_To_Columns.from (that:Vector) =
fields = 0.up_to that.length . map _.to_text
Convertible_To_Columns.Value fields (k-> that.at (Integer.parse k))

## PRIVATE
Convertible_To_Columns.from (that:Array) = Convertible_To_Columns.from that.to_vector

## PRIVATE
Convertible_To_Columns.from (that:Range) = Convertible_To_Columns.from that.to_vector

## PRIVATE
Convertible_To_Columns.from (that:Date_Range) = Convertible_To_Columns.from that.to_vector

## PRIVATE
Convertible_To_Columns.from (that:Any) =
name = "Value"
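With these conversions any container column can be expanded to columns; a rough sketch of the `Pair` case with made-up data (the output naming assumes the default column-name prefix):

    from Standard.Table import Table

    t = Table.new [["p", [Pair.new 1 2]]]
    t.expand_column "p"
    # ~ Table.new [["p 0", [1]], ["p 1", [2]]]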
|
@ -1,4 +1,7 @@
from Standard.Base import all
import Standard.Base.Data.Java_Json.Jackson_Object

import project.Data.Conversions.Convertible_To_Columns.Convertible_To_Columns

## PRIVATE
A special type that is used to define what types can be converted to a table
@ -14,7 +17,9 @@ type Convertible_To_Rows
Arguments:
- length: The number of rows in the table.
- getter: Get the value for a specified row.
Value length:Integer (getter : Integer->Any)
- columns: The names for the columns when the object is expanded.
These will be added to the name of the input column.
Value length:Integer (getter : Integer->Any) (columns:Vector=["Value"])

## PRIVATE
Return the iterator values as a `Vector`.
@ -39,6 +44,51 @@ Convertible_To_Rows.from that:Pair = Convertible_To_Rows.Value that.length that.
## PRIVATE
Convertible_To_Rows.from that:Date_Range = Convertible_To_Rows.Value that.length that.get

## PRIVATE
Convertible_To_Rows.from that:Map =
vals = that.to_vector.map p-> Key_Value.Pair p.first p.second
Convertible_To_Rows.Value vals.length vals.get ["Key", "Value"]

## PRIVATE
Convertible_To_Rows.from that:JS_Object =
vals = that.map_with_key k->v-> Key_Value.Pair k v
Convertible_To_Rows.Value vals.length vals.get ["Key", "Value"]

## PRIVATE
Convertible_To_Rows.from that:Jackson_Object =
vals = that.map_with_key k->v-> Key_Value.Pair k v
Convertible_To_Rows.Value vals.length vals.get ["Key", "Value"]

## PRIVATE
Convertible_To_Rows.from (that:Any) =
Convertible_To_Rows.Value 1 (n-> if n==0 then that else Nothing)

## PRIVATE
type Key_Value
## PRIVATE
Arguments:
- key: The key of the pair.
- value: The value of the pair.
Pair key:Any value:Any

## PRIVATE
at self idx = self.get idx

## PRIVATE
Return the key or the value of the pair, selected by index or name.
get self idx = case idx of
0 -> self.key
1 -> self.value
"Key" -> self.key
"Value" -> self.value
_ -> Nothing

## PRIVATE
is_empty self = False

## PRIVATE
length self = 2

## PRIVATE
Convertible_To_Columns.from (that:Key_Value) =
Convertible_To_Columns.Value ["Key", "Value"] (k-> if k == "Key" then that.key else that.value)
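These `Key_Value` conversions are what make `Map`, `JS_Object` and `Jackson_Object` expand to rows with a key column and a value column; a rough sketch with made-up data:

    from Standard.Table import Table

    t = Table.new [["id", [1]], ["data", [Map.from_vector [["a", 10], ["b", 20]]]]]
    t.expand_to_rows "data"
    # ~ Table.new [["id", [1, 1]], ["data Key", ["a", "b"]], ["data Value", [10, 20]]]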
|
@ -1347,7 +1347,7 @@ type Table
expand_to_rows self column at_least_one_row=False =
Expand_Objects_Helpers.expand_to_rows self column at_least_one_row

## ALIAS filter rows
## ALIAS filter rows, where
GROUP Standard.Base.Selections
ICON preparation

@ -2487,14 +2487,13 @@ type Table

example_to_xlsx = Examples.inventory_table.write (enso_project.data / "example_xlsx_output.xlsx") Excel
@format Widget_Helpers.write_table_selector
write : File|Text -> File_Format -> Existing_File_Behavior -> Match_Columns -> Problem_Behavior -> File ! Column_Count_Mismatch | Illegal_Argument | File_Error
write self path format=Auto_Detect on_existing_file=Existing_File_Behavior.Backup match_columns=Match_Columns.By_Name on_problems=Report_Warning =
file = File.new path
write : File -> File_Format -> Existing_File_Behavior -> Match_Columns -> Problem_Behavior -> File ! Column_Count_Mismatch | Illegal_Argument | File_Error
write self path:File format=Auto_Detect on_existing_file=Existing_File_Behavior.Backup match_columns=Match_Columns.By_Name on_problems=Report_Warning =
case format of
_ : Auto_Detect ->
base_format = format.get_writing_format file
if base_format == Nothing then Error.throw (File_Error.Unsupported_Output_Type file Table) else
self.write file format=base_format on_existing_file match_columns on_problems
base_format = format.get_writing_format path
if base_format == Nothing then Error.throw (File_Error.Unsupported_Output_Type path Table) else
self.write path format=base_format on_existing_file match_columns on_problems
_ ->
handle_no_write_method caught_panic =
is_write = caught_panic.payload.method_name == "write_table"
@ -2502,7 +2501,7 @@ type Table
Error.throw (File_Error.Unsupported_Output_Type format Table)
Panic.catch No_Such_Method handler=handle_no_write_method <|
to_write = if Context.Output.is_enabled then self else self.take 1000
format.write_table file to_write on_existing_file match_columns on_problems
format.write_table path to_write on_existing_file match_columns on_problems

## Creates a text representation of the table using the CSV format.
to_csv : Text
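Note how the manual `file = File.new path` step disappears: with `path:File` in the signature, a `Text` argument is converted up front, so existing callers keep working. A small sketch with an illustrative table and path:

    from Standard.Table import Table

    t = Table.new [["x", [1, 2]]]
    t.write "output.csv"    # Text converts to File, same as t.write (File.new "output.csv")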
|
@ -33,22 +33,19 @@ type Excel_Workbook
- file: The file to load.
- xls_format: Whether to use the old XLS format (default is XLSX).
- headers: Whether to use the first row as headers (default is to infer).
new : File | Text | Temporary_File -> Boolean -> Boolean | Infer -> Excel_Workbook
new file xls_format=False headers=Infer =
file_obj = case file of
tmp : Temporary_File -> tmp
other -> File.new other
file_for_errors = if file_obj.is_a Temporary_File then Nothing else file_obj
new : File | Temporary_File -> Boolean -> Boolean | Infer -> Excel_Workbook
new file:(File|Temporary_File) xls_format=False headers=Infer =
file_for_errors = if file.is_a Temporary_File then Nothing else file

continuation raw_file =
format = if xls_format then ExcelFileFormat.XLS else ExcelFileFormat.XLSX
File_Error.handle_java_exceptions raw_file <| Excel_Reader.handle_bad_format file_for_errors <| Illegal_State.handle_java_exception <|
# The `java_file` depends on the liveness of the possible `Temporary_File` but that is ensured by storing the `file_obj` in the resulting workbook instance.
# The `java_file` depends on the liveness of the possible `Temporary_File` but that is ensured by storing the `file` in the resulting workbook instance.
java_file = Java_File.new raw_file.absolute.normalize.path
excel_connection_resource = Managed_Resource.register (ExcelConnectionPool.INSTANCE.openReadOnlyConnection java_file format) close_connection
Excel_Workbook.Value (Ref.new excel_connection_resource) file_obj xls_format headers
Excel_Workbook.Value (Ref.new excel_connection_resource) file xls_format headers

case file_obj of
case file of
tmp : Temporary_File -> tmp.with_file continuation
f : File -> continuation f
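The same conversion idiom applies here, which is why the `Text` branch of the old `case` could be deleted; a brief sketch, assuming a workbook at the made-up path "data.xlsx":

    from Standard.Table import all

    wb = Excel_Workbook.new "data.xlsx"    # Text converts to File via `File.from`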
|
@ -28,9 +28,13 @@ expand_column (table : Table) (column : Text | Integer) (fields : (Vector Text)
Prefix_Name.None -> ""
Prefix_Name.Column_Name -> column_object.name+" "
Prefix_Name.Custom value -> value
default_name = case prefix of
Prefix_Name.None -> "Value"
Prefix_Name.Column_Name -> column_object.name
Prefix_Name.Custom value -> value
naming_strategy = table.column_naming_helper.create_unique_name_strategy
naming_strategy.mark_used (table.column_names.filter (c->c!=column_object.name))
new_names = naming_strategy.make_all_unique (expanded.column_names.map n-> resolved_prefix+n)
new_names = naming_strategy.make_all_unique (expanded.column_names.map n-> if n=='Value' then default_name else resolved_prefix+n)
new_columns = new_names.zip expanded.columns (n->c-> c.rename n)

## Create Merged Columns
@ -74,13 +78,16 @@ expand_column (table : Table) (column : Text | Integer) (fields : (Vector Text)
# => Table.new [["aaa", [1, 1, 2, 2]], ["bbb", [30, 31, 40, 41]]]
@column Widget_Helpers.make_column_name_selector
expand_to_rows : Table -> Text | Integer -> Boolean -> Table ! Type_Error | No_Such_Column | Index_Out_Of_Bounds
expand_to_rows table column at_least_one_row=False =
expand_to_rows table column:(Text|Integer) at_least_one_row=False = if column.is_a Integer then expand_to_rows table (table.at column).name at_least_one_row else
row_expander : Any -> Vector
row_expander value:Convertible_To_Rows = value.to_vector

column_names : Any -> Vector
column_names value:Convertible_To_Rows = value.columns.map name-> if name=="Value" then column else column+" "+name

Java_Problems.with_problem_aggregator Problem_Behavior.Report_Warning java_problem_aggregator->
builder size = make_inferred_builder size java_problem_aggregator
Fan_Out.fan_out_to_rows table column row_expander at_least_one_row column_builder=builder
Fan_Out.fan_out_to_rows table column row_expander column_names at_least_one_row column_builder=builder

## PRIVATE
create_table_from_objects : Convertible_To_Rows -> (Vector Text | Nothing) -> Table
@ -130,6 +137,6 @@ create_table_from_objects (value : Convertible_To_Rows) (fields : Vector | Nothi
columns = case preset_fields of
True -> fields.distinct.map column_map.get
False ->
if discovered_field_names.is_empty then Error.throw (Illegal_Argument.Error "Unable to discover expected column names, because all input objects had no fields. Specify fields explicitly if you need a constant set of expected columns.") else
if discovered_field_names.is_empty then Error.throw (Illegal_Argument.Error "Unable to generate column names as all inputs had no fields.") else
discovered_field_names.to_vector.map column_map.get
Table.new columns
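The default-name change is visible when expanding simple containers: the expanded column keeps the input column's name instead of gaining a `Value` suffix. A rough sketch with made-up data:

    from Standard.Table import Table

    t = Table.new [["n", [1, 2]], ["v", [[10, 11], [20]]]]
    t.expand_to_rows "v"
    # ~ Table.new [["n", [1, 1, 2]], ["v", [10, 11, 20]]]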
|
@ -1,6 +1,8 @@
from Standard.Base import all
import Standard.Base.Runtime.Ref.Ref

import project.Data.Column.Column
import project.Data.Conversions.Convertible_To_Rows.Key_Value
import project.Data.Table.Table
import project.Data.Type.Value_Type.Value_Type
import project.Internal.Problem_Builder.Problem_Builder
@ -40,16 +42,20 @@ fan_out_to_columns table input_column_id function column_count=Nothing column_bu
- input_column: The column to transform.
- function: A function that transforms a single element of `input_column`
to multiple values.
- column_names: The names for the generated columns, or a callback to create
the names for each row.
- at_least_one_row: When true, if the function returns an empty list, a
single row is output with `Nothing` for the transformed column. If false,
the row is not output at all.
fan_out_to_rows : Table -> Text | Integer -> (Any -> Vector Any) -> Boolean -> (Integer -> Any) -> Problem_Behavior -> Table
fan_out_to_rows table input_column_id function at_least_one_row=False column_builder=make_string_builder on_problems=Report_Error =
fan_out_to_rows : Table -> Text -> (Any -> Vector Any) -> Vector | Function -> Boolean -> (Integer -> Any) -> Problem_Behavior -> Table
fan_out_to_rows table input_column_id:Text function column_names=[input_column_id] at_least_one_row=False column_builder=make_string_builder on_problems=Report_Error =
## Treat this as a special case of fan_out_to_rows_and_columns, with one
column. Wrap the provided function to convert each value to a singleton
`Vector`.
wrapped_function x = function x . map y-> [y]
column_names = [input_column_id]
wrapped_function x = function x . map y-> case y of
_:Vector -> y
_:Key_Value -> y
_ -> [y]
fan_out_to_rows_and_columns table input_column_id wrapped_function column_names at_least_one_row=at_least_one_row column_builder=column_builder on_problems=on_problems

## PRIVATE
@ -99,63 +105,126 @@ fan_out_to_rows_and_columns table input_column_id function column_names at_least

input_column = table.at input_column_id
input_storage = input_column.java_column.getStorage
num_input_rows = input_storage.size

num_output_columns = column_names.length

# Guess that most of the time, we'll get at least one value for each input.
initial_size = input_column.length
# Accumulates the outputs of the function.
output_column_builders = Vector.new num_output_columns _-> column_builder initial_size
# Accumulates repeated position indices for the order mask.
order_mask_positions = Vector.new_builder initial_size

maybe_add_empty_row vecs =
should_add_empty_row = vecs.is_empty && at_least_one_row
if should_add_empty_row.not then vecs else
empty_row = Vector.fill num_output_columns Nothing
[empty_row]

0.up_to num_input_rows . each i->
input_value = input_storage.getItemBoxed i
output_values = function input_value |> maybe_add_empty_row
# Append each group of values to the builder.
output_values.each row_unchecked->
row = uniform_length num_output_columns row_unchecked problem_builder
row.each_with_index i-> v-> output_column_builders.at i . append v
# Append n copies of the input row position, n = # of output values.
repeat_each output_values.length <| order_mask_positions.append i
# Create the columns and a mask.
pair = if column_names.is_a Vector then fan_out_to_rows_and_columns_fixed input_storage function at_least_one_row column_names column_builder problem_builder else
fan_out_to_rows_and_columns_dynamic input_storage function at_least_one_row column_names column_builder problem_builder
raw_output_columns = pair.first
order_mask_positions = pair.second

# Reserve the non-input column names that will not be changing.
non_input_columns = table.columns.filter c-> c.name != input_column.name
unique.mark_used <| non_input_columns.map .name

# Build the output column
output_storages = output_column_builders.map .seal
output_columns = output_storages.map_with_index i-> output_storage->
column_name = unique.make_unique <| column_names.at i
Column.from_storage column_name output_storage
# Make output columns unique.
output_columns = raw_output_columns.map column->
column_name = unique.make_unique column.name
column.rename column_name

# Build the order mask.
order_mask = OrderMask.fromArray (order_mask_positions.to_vector)

## Build the new table, replacing the input column with the new output
columns.
## Build the new table, replacing the input column with the new output columns.
new_columns_unflattened = table.columns.map column->
case column.name == input_column_id of
True ->
# Replace the input column with the output columns.
output_columns
False ->
# Build a new column from the old one with the mask
old_storage = column.java_column.getStorage
new_storage = old_storage.applyMask order_mask
[Column.from_storage column.name new_storage]
new_columns = new_columns_unflattened.flatten

new_table = Table.new new_columns
# Replace the input column with the output columns.
if column.name == input_column_id then output_columns else
# Build a new column from the old one with the mask
old_storage = column.java_column.getStorage
new_storage = old_storage.applyMask order_mask
[Column.from_storage column.name new_storage]
new_table = Table.new new_columns_unflattened.flatten
problem_builder.attach_problems_after on_problems new_table

## PRIVATE
Inner method for fan_out_to_rows_and_columns where the column names are fixed.
fan_out_to_rows_and_columns_fixed : Any -> (Any -> Vector (Vector Any)) -> Boolean -> Vector Text -> (Integer -> Any) -> Problem_Builder -> Vector
fan_out_to_rows_and_columns_fixed input_storage function at_least_one_row:Boolean column_names:Vector column_builder problem_builder =
num_output_columns = column_names.length
num_input_rows = input_storage.size

# Accumulates the outputs of the function.
output_column_builders = Vector.new num_output_columns _-> column_builder num_input_rows

# Accumulates repeated position indices for the order mask.
order_mask_positions = Vector.new_builder num_input_rows

empty_row = [Vector.fill num_output_columns Nothing]
maybe_add_empty_row vecs = if vecs.is_empty && at_least_one_row then empty_row else vecs

0.up_to num_input_rows . each i->
input_value = input_storage.getItemBoxed i
output_values = maybe_add_empty_row (function input_value)

output_values.each row_unchecked->
row = uniform_length num_output_columns row_unchecked problem_builder
row.each_with_index i-> v-> output_column_builders.at i . append v

# Append n copies of the input row position, n = # of output values.
repeat_each output_values.length <| order_mask_positions.append i

output_columns = column_names.map_with_index i->n->
Column.from_storage n (output_column_builders.at i . seal)

[output_columns, order_mask_positions]

## PRIVATE
Inner method for fan_out_to_rows_and_columns where the column names are determined by each row.
fan_out_to_rows_and_columns_dynamic : Any -> (Any -> Vector (Vector Any)) -> Boolean -> (Any -> Text) -> (Integer -> Any) -> Problem_Builder -> Vector
fan_out_to_rows_and_columns_dynamic input_storage function at_least_one_row column_names_for_row column_builder problem_builder =
# Accumulates the outputs of the function.
column_map = Ref.new Map.empty
output_column_builders = Vector.new_builder

# Guess that most of the time, we'll get at least one value for each input.
num_input_rows = input_storage.size

# Column Builder add function
add_column n current_length =
column_map.put (column_map.get.insert n output_column_builders.length)
builder = column_builder num_input_rows
builder.appendNulls current_length
output_column_builders.append builder

# Accumulates repeated position indices for the order mask.
order_mask_positions = Vector.new_builder num_input_rows

maybe_add_empty_row vecs = if (vecs.is_empty && at_least_one_row).not then vecs else
[Vector.fill output_column_builders.length Nothing]

0.up_to num_input_rows . each i->
input_value = input_storage.getItemBoxed i
output_values = maybe_add_empty_row (function input_value)

# Get the column names for the row.
row_column_names = column_names_for_row input_value

# Add any missing columns.
row_column_names.each n->
if column_map.get.contains_key n . not then
add_column n order_mask_positions.length

# Append each group of values to the builder.
current_columns = column_map.get
output_values.each row_unchecked->
row = uniform_length row_column_names.length row_unchecked problem_builder
row_column_names.each_with_index i->n->
output_column_builders.at (current_columns.at n) . append (row.at i)

# Fill in values for any column not present
if row_column_names.length != output_column_builders.length then
current_columns.each_with_key k->i->
if row_column_names.contains k . not then
output_column_builders.at i . appendNulls output_values.length

# Append n copies of the input row position, n = # of output values.
repeat_each output_values.length <| order_mask_positions.append i

# Build the output column
output_columns = column_map.get.to_vector.sort on=_.second . map pair->
Column.from_storage pair.first (output_column_builders.at pair.second . seal)

[output_columns, order_mask_positions]

## PRIVATE

Map a multi-valued function over a column and return the results as set of
|
@ -11,16 +11,16 @@ split_to_columns : Table -> Text | Integer -> Text -> Integer | Nothing -> Probl
split_to_columns table input_column_id delimiter="," column_count=Nothing on_problems=Report_Error =
column = table.at input_column_id
Value_Type.expect_text column <|
fan_out_to_columns table input_column_id (handle_nothing (_.split delimiter)) column_count on_problems=on_problems
fan_out_to_columns table column.name (handle_nothing (_.split delimiter)) column_count on_problems=on_problems

## PRIVATE
Splits a column of text into a set of new rows.
See `Table.split_to_rows`.
split_to_rows : Table -> Text | Integer -> Text -> Table
split_to_rows table input_column_id delimiter="," =
split_to_rows table input_column_id:(Text|Integer) delimiter="," =
column = table.at input_column_id
Value_Type.expect_text column
fan_out_to_rows table input_column_id (handle_nothing (_.split delimiter)) at_least_one_row=True
Value_Type.expect_text column <|
fan_out_to_rows table column.name (handle_nothing (_.split delimiter)) at_least_one_row=True

## PRIVATE
Tokenizes a column of text into a set of new columns using a regular
@ -29,8 +29,8 @@ split_to_rows table input_column_id delimiter="," =
tokenize_to_columns : Table -> Text | Integer -> Text -> Case_Sensitivity -> Integer | Nothing -> Problem_Behavior -> Table
tokenize_to_columns table input_column_id pattern case_sensitivity column_count on_problems =
column = table.at input_column_id
Value_Type.expect_text column
fan_out_to_columns table input_column_id (handle_nothing (_.tokenize pattern case_sensitivity)) column_count on_problems=on_problems
Value_Type.expect_text column <|
fan_out_to_columns table column.name (handle_nothing (_.tokenize pattern case_sensitivity)) column_count on_problems=on_problems

## PRIVATE
Tokenizes a column of text into a set of new rows using a regular
@ -39,8 +39,8 @@ tokenize_to_columns table input_column_id pattern case_sensitivity column_count
tokenize_to_rows : Table -> Text | Integer -> Text -> Case_Sensitivity -> Boolean -> Table
tokenize_to_rows table input_column_id pattern="." case_sensitivity=Case_Sensitivity.Sensitive at_least_one_row=False =
column = table.at input_column_id
Value_Type.expect_text column
fan_out_to_rows table input_column_id (handle_nothing (_.tokenize pattern case_sensitivity)) at_least_one_row=at_least_one_row
Value_Type.expect_text column <|
fan_out_to_rows table column.name (handle_nothing (_.tokenize pattern case_sensitivity)) at_least_one_row=at_least_one_row

## PRIVATE
Converts a Text column into new columns using a regular expression
@ -54,13 +54,14 @@ parse_to_columns table input_column_id (pattern:(Text | Regex)=".") case_sensiti
case_insensitive = case_sensitivity.is_case_insensitive_in_memory
Regex.compile pattern case_insensitive=case_insensitive

fun = handle_nothing (regex_parse_to_vectors regex)
column_names = regex_to_column_names regex input_column_id

column = table.at input_column_id

fun = handle_nothing (regex_parse_to_vectors regex)
column_names = regex_to_column_names regex column.name

new_table = Value_Type.expect_text column <|
fan_out_to_rows_and_columns table input_column_id fun column_names at_least_one_row=True on_problems=on_problems
fan_out_to_rows_and_columns table column.name fun column_names at_least_one_row=True on_problems=on_problems
if parse_values then new_table.parse on_problems=on_problems else new_table

## PRIVATE
|
@ -118,7 +118,7 @@ make_filter_condition_selector table display=Display.Always =
builder.append (Option "Less Than" fqn+".Less" [["than", col_names]])
builder.append (Option "Less Than Or Equal" fqn+".Equal_Or_Less" [["than", col_names]])
builder.append (Option "Greater Than" fqn+".Greater" [["than", col_names]])
builder.append (Option "Greater Than Or Equal" fqn+".Greater_Or_Less" [["than", col_names]])
builder.append (Option "Greater Than Or Equal" fqn+".Equal_Or_Greater" [["than", col_names]])
builder.append (Option "Between" fqn+".Between" [["lower", col_names], ["upper", col_names]])
builder.append (Option "Equals Ignore Case" fqn+".Equal_Ignore_Case" [["to", col_names]])
builder.append (Option "Starts With" fqn+".Starts_With" [["prefix", col_names]])
|
@ -20,4 +20,4 @@ file_uploading path =
- file_path: The path at which the file is being uploaded.
type File_Being_Uploaded
## PRIVATE
Value file_path
Value file_path:Text
|
@ -43,31 +43,46 @@ case class GithubHeuristic(info: DependencyInformation, log: Logger) {
*/
def tryDownloadingAttachments(address: String): Seq[Attachment] =
try {
val homePage = url(address).cat.!!
val fileRegex = """<a .*? href="(.*?)".*?>(.*?)</a>""".r("href", "name")
val matches = fileRegex
.findAllMatchIn(homePage)
.map(m => (m.group("name"), m.group("href")))
.filter(p => mayBeRelevant(p._1))
.toList
matches.flatMap { case (_, href) =>
try {
val content =
url("https://github.com" + href.replace("blob", "raw")).cat.!!
Seq(
AttachedFile(
PortablePath.of(href),
content,
origin = Some("github.com")
)
)
} catch {
case NonFatal(error) =>
log.warn(
s"Found file $href but cannot download it: $error"
)
Seq()
}
val homePage = url(address).cat.!!
val branchRegex = """"defaultBranch":"([^"]*?)"""".r("branch")
val branch = branchRegex.findFirstMatchIn(homePage).map(_.group("branch"))
branch match {
case None =>
log.warn(s"Cannot find default branch for $address")
Seq()
case Some(branch) =>
val fileRegex =
"""\{"name":"([^"]*?)","path":"([^"]*?)","contentType":"file"\}"""
.r("name", "path")
val matches = fileRegex
.findAllMatchIn(homePage)
.map(m => (m.group("name"), m.group("path")))
.filter(p => mayBeRelevant(p._1))
.toList
matches.flatMap { case (_, path) =>
val rawHref = address + "/raw/" + branch + "/" + path
// This path is reconstructed to match the 'legacy' format for compatibility with older versions of the review settings.
// It has the format <org>/<repo>/blob/<branch>/<path>
val internalPath = address
.stripPrefix("https://github.com")
.stripSuffix("/") + "/blob/" + branch + "/" + path
try {
val content = url(rawHref).cat.!!
Seq(
AttachedFile(
PortablePath.of(internalPath),
content,
origin = Some(address)
)
)
} catch {
case NonFatal(error) =>
log.warn(
s"Found file $rawHref but cannot download it: $error"
)
Seq()
}
}
}
} catch {
case NonFatal(error) =>
|
@ -6,7 +6,11 @@ import java.io.OutputStream;
import java.nio.Buffer;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.*;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.util.Arrays;
import java.util.function.BiConsumer;
import java.util.function.Function;
@ -99,7 +103,7 @@ public class Encoding_Utils {
* @return the resulting string
*/
public static ResultWithWarnings<String> from_bytes(byte[] bytes, Charset charset) {
if (bytes.length == 0) {
if (bytes == null || bytes.length == 0) {
return new ResultWithWarnings<>("");
}
|
std-bits/base/src/main/java/org/enso/base/FileLineReader.java (new file, 417 lines)
@ -0,0 +1,417 @@
package org.enso.base;

import com.ibm.icu.text.Normalizer2;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;
import java.util.function.Function;
import java.util.logging.Level;
import java.util.logging.Logger;
import org.enso.base.arrays.LongArrayList;
import org.graalvm.polyglot.Context;

/** A reader for reading lines from a file one at a time. */
public class FileLineReader {
public static class ByteArrayOutputStreamWithContains extends ByteArrayOutputStream {
public ByteArrayOutputStreamWithContains(int size) {
super(size);
}

/** Creates a preloaded stream from a byte array. */
public static ByteArrayOutputStreamWithContains fromByteArray(byte[] bytes) {
var stream = new ByteArrayOutputStreamWithContains(0);
stream.buf = bytes;
stream.count = bytes.length;
return stream;
}

/**
* Computes the longest-prefix table for the given byte array. Based on <a
* href="https://www.geeksforgeeks.org/kmp-algorithm-for-pattern-searching/">Geeks for Geeks</a>
*/
public static int[] computeLongestPrefix(byte[] bytes) {
int[] longestPrefix = new int[bytes.length];

int i = 1;
int len = 0;
while (i < bytes.length) {
if (bytes[i] == bytes[len]) {
len++;
longestPrefix[i++] = len;
} else if (len == 0) {
longestPrefix[i++] = 0;
} else {
len = longestPrefix[len - 1];
}
}

return longestPrefix;
}

/** Checks if the stream contains the given byte array. */
public boolean contains(byte[] bytes, int[] longestPrefix) {
// ToDo: Needs to deal with the Unicode scenario where the next character is a combining
// character. #8900
if (bytes.length > count) {
return false;
}

int i = 0;
int j = 0;
while ((count - i) >= (bytes.length - j)) {
if (buf[i] == bytes[j]) {
i++;
j++;
}

if (j == bytes.length) {
return true;
}

if (i < count && buf[i] != bytes[j]) {
if (j != 0) {
j = longestPrefix[j - 1];
} else {
i++;
}
}
}

return false;
}
}

private static class CancellationToken {
public boolean isCancelled = false;

public void cancel() {
isCancelled = true;
}
}

private static final Logger LOGGER = Logger.getLogger("enso-file-line-reader");

/** Amount of data to read at a time for a single line (4KB). */
private static final int LINE_BUFFER = 4 * 1024;

/** Amount of data to read at a time (4MB). */
private static final int BUFFER_SIZE = 4 * 1024 * 1024;

private static boolean moreToRead(int c, MappedByteBuffer buffer) {
return switch (c) {
case '\n', -1 -> false;
case '\r' -> {
c = buffer.hasRemaining() ? buffer.get() : '\n';
if (c != '\n') {
buffer.position(buffer.position() - 1);
}
yield false;
}
default -> true;
};
}

private static int readByte(MappedByteBuffer buffer) {
return buffer.hasRemaining() ? buffer.get() : -1;
}

/**
* Reads a line into an OutputStream. Returns true if the end of the line was found, false if the
* buffer finished.
*/
private static boolean readLine(MappedByteBuffer buffer, ByteArrayOutputStream result) {
int c = readByte(buffer);
while (moreToRead(c, buffer)) {
result.write(c);
c = readByte(buffer);
Context.getCurrent().safepoint();
}
return c != -1 && (c != '\r' || buffer.hasRemaining());
}

/**
* Scans forward one line. Returns true if the end of the line was found, false if the buffer
* finished.
*/
private static boolean scanLine(MappedByteBuffer buffer) {
int c = readByte(buffer);
while (moreToRead(c, buffer)) {
c = readByte(buffer);
Context.getCurrent().safepoint();
}
return c != -1 && (c != '\r' || buffer.hasRemaining());
}

/** Reads a line from a file at the given index using the existing rowMap. */
private static String readLineByIndex(
File file, long length, LongArrayList rowMap, int index, Charset charset) throws IOException {
if (index >= rowMap.getSize()) {
throw new IndexOutOfBoundsException(index);
}

long position = rowMap.get(index);
if (position >= length) {
return null;
}
long toRead =
rowMap.getSize() > index + 1 ? rowMap.get(index + 1) - position : length - position;

// Output buffer
var outputStream = new ByteArrayOutputStream(128);

// Only read what we have to.
try (var stream = new FileInputStream(file)) {
var channel = stream.getChannel();
int bufferSize = (int) Math.min(LINE_BUFFER, toRead);
long remaining = toRead - bufferSize;
var buffer = channel.map(FileChannel.MapMode.READ_ONLY, position, bufferSize);
var result = readLine(buffer, outputStream);
while (!result && remaining > 0) {
position += bufferSize;
bufferSize = (int) Math.min(LINE_BUFFER, remaining);
remaining -= bufferSize;
buffer = channel.map(FileChannel.MapMode.READ_ONLY, position, bufferSize);
result = readLine(buffer, outputStream);
}
}

return outputStream.toString(charset);
}

/** Scans forward in a file and returns the line at the given index. */
public static String readSingleLine(
File file,
long length,
LongArrayList rowMap,
int index,
Charset charset,
Function<ByteArrayOutputStreamWithContains, String> filter)
throws IOException {
int size = rowMap.getSize();
if (index != -1 && size > index) {
return readLineByIndex(file, length, rowMap, index, charset);
}

// Start at the last known line and scan forward.
return forEachLine(file, length, rowMap, size - 1, index, charset, filter, null);
}

/** Scans forward in a file reading line by line, returning all the matching lines. */
public static List<String> readLines(
File file,
long length,
LongArrayList rowMap,
int startAt,
int endAt,
Charset charset,
Function<ByteArrayOutputStreamWithContains, String> filter)
throws IOException {
List<String> result = new ArrayList<>();
forEachLine(
file, length, rowMap, startAt, endAt, charset, filter, (index, line) -> result.add(line));
return result;
}

/**
* Scans forward in a file reading line by line.
*
* @param file The file to read.
* @param length The length of the file in bytes.
* @param rowMap The rowMap to use.
* @param startAt The index to start at.
* @param endAt The index to end at (inclusive).
* @param charset The charset to use.
* @param filter The filter to apply to each line.
* @param action The action to apply to each line (optional).
* @return The last line read or null if end of file is reached.
*/
public static String forEachLine(
File file,
long length,
LongArrayList rowMap,
int startAt,
int endAt,
Charset charset,
Function<ByteArrayOutputStreamWithContains, String> filter,
BiConsumer<Integer, String> action)
throws IOException {
return innerForEachLine(
file, length, rowMap, startAt, endAt, charset, filter, action, new CancellationToken());
}

private static String innerForEachLine(
File file,
long length,
LongArrayList rowMap,
int startAt,
int endAt,
Charset charset,
Function<ByteArrayOutputStreamWithContains, String> filter,
BiConsumer<Integer, String> action,
CancellationToken cancellationToken)
throws IOException {
if (startAt >= rowMap.getSize()) {
throw new IndexOutOfBoundsException(startAt);
}
int index = action == null ? rowMap.getSize() - 1 : startAt;

long position = rowMap.get(index);
if (position >= length) {
return null;
}

boolean readAll = filter != null || action != null || endAt == -1;
var outputStream = new ByteArrayOutputStreamWithContains(128);
String output = null;

try (var stream = new FileInputStream(file)) {
var channel = stream.getChannel();

var bufferSize = (int) Math.min(BUFFER_SIZE, (length - position));
var truncated = bufferSize != (length - position);
var buffer = channel.map(FileChannel.MapMode.READ_ONLY, position, bufferSize);

// Loop until we either reach the required record or run out of data.
while (!cancellationToken.isCancelled
&& (endAt == -1 || index <= endAt)
&& (truncated || buffer.hasRemaining())) {
var linePosition = buffer.position() + position;

// Read a line.
outputStream.reset();
boolean success =
(readAll || index == endAt) ? readLine(buffer, outputStream) : scanLine(buffer);

if (success || !truncated) {
String line = null;
if (filter == null || (line = filter.apply(outputStream)) != null) {
if (index >= rowMap.getSize()) {
rowMap.add(linePosition);
}

if (action != null) {
line = line == null ? outputStream.toString(charset) : line;
action.accept(index, line);
}

if (index == endAt) {
output = line == null ? outputStream.toString(charset) : line;
}

if (index % 100000 == 0) {
LOGGER.log(Level.INFO, "Scanned Lines: {0}", index);
}
index++;

// If no filter we can record the start of the next line.
if (filter == null && index == rowMap.getSize()) {
rowMap.add(buffer.position() + position);
}
}

// Fast-forward if needed
if (filter != null && index < rowMap.getSize()) {
int newPosition = Math.min(bufferSize, (int) (rowMap.get(index) - position));
buffer.position(newPosition);
}
} else {
// Read more if we need to
if (!buffer.hasRemaining()) {
position = linePosition;
bufferSize = (int) Math.min(BUFFER_SIZE, (length - position));
truncated = bufferSize != (length - position);
buffer = channel.map(FileChannel.MapMode.READ_ONLY, position, bufferSize);
}
}
}

if (!truncated && !buffer.hasRemaining() && rowMap.get(rowMap.getSize() - 1) != length) {
// Add the last line to mark that we reached the end.
rowMap.add(length);
}

return output;
}
}

/**
* Scans forward in a file reading line by line until it finds a line that matches the new filter.
*/
public static long findFirstNewFilter(
File file,
long length,
LongArrayList rowMap,
int endAt,
Charset charset,
Function<ByteArrayOutputStreamWithContains, String> filter,
Function<ByteArrayOutputStreamWithContains, String> newFilter)
throws IOException {
final CancellationToken token = new CancellationToken();
final List<Long> result = new ArrayList<>();
BiConsumer<Integer, String> action =
(index, line) -> {
var bytes = line.getBytes(charset);
var outputStream = ByteArrayOutputStreamWithContains.fromByteArray(bytes);
if (newFilter.apply(outputStream) != null) {
result.add(rowMap.get(index));
token.cancel();
}
};
innerForEachLine(file, length, rowMap, 0, endAt, charset, filter, action, token);
return result.isEmpty() ? rowMap.get(rowMap.getSize() - 1) : result.get(0);
}

/** Creates a filter that checks if the line contains the given string. */
public static Function<ByteArrayOutputStreamWithContains, String> createContainsFilter(
String contains, Charset charset) {
if (isUnicodeCharset(charset)) {
var nfcVersion = Normalizer2.getNFCInstance().normalize(contains);
var nfdVersion = Normalizer2.getNFDInstance().normalize(contains);
if (!nfcVersion.equals(nfdVersion)) {
// Need to use Unicode normalization for equality.
return (outputStream) -> {
var line = outputStream.toString(charset);
return Text_Utils.contains(contains, line) ? line : null;
};
}
}

var bytes = contains.getBytes(charset);
var prefixes = ByteArrayOutputStreamWithContains.computeLongestPrefix(bytes);
return (outputStream) ->
outputStream.contains(bytes, prefixes) ? outputStream.toString(charset) : null;
}

/** Wraps an Enso function filter in a FileLineReader filter. */
public static Function<ByteArrayOutputStreamWithContains, String> wrapBooleanFilter(
Function<String, Boolean> filter, Charset charset) {
return (outputStream) -> {
var line = outputStream.toString(charset);
return filter.apply(line) ? line : null;
};
}

/** Joins two filters together. */
public static Function<ByteArrayOutputStreamWithContains, String> mergeTwoFilters(
Function<ByteArrayOutputStreamWithContains, String> first,
Function<ByteArrayOutputStreamWithContains, String> second) {
return (outputStream) -> {
var first_result = first.apply(outputStream);
return first_result != null ? second.apply(outputStream) : null;
};
}

private static boolean isUnicodeCharset(Charset charset) {
return charset == StandardCharsets.UTF_8
|| charset == StandardCharsets.UTF_16
|| charset == StandardCharsets.UTF_16BE
|| charset == StandardCharsets.UTF_16LE;
}
}
|
@ -0,0 +1,44 @@
package org.enso.base.arrays;

import java.util.Arrays;

/** A helper to efficiently build an array of unboxed longs of arbitrary length. */
public class LongArrayList {
private long[] backingStorage;
private int lastIndex = -1;

public LongArrayList() {
backingStorage = new long[32];
}

/** Gets the number of elements in the list. */
public int getSize() {
return lastIndex + 1;
}

/** Gets an element from the list. */
public long get(int index) {
if (index > lastIndex) {
throw new IndexOutOfBoundsException(index);
}
return backingStorage[index];
}

/** Gets an element from the list, or the last element if the index is past the end. */
public long getOrLast(int index) {
return backingStorage[Math.min(index, lastIndex)];
}

/** Adds an element to the list. */
public void add(long x) {
int index;

index = lastIndex + 1;
if (index >= backingStorage.length) {
backingStorage = Arrays.copyOf(backingStorage, backingStorage.length * 2);
}

backingStorage[index] = x;
lastIndex = index;
}
}
@ -44,6 +44,32 @@ add_specs suite_builder =
|
||||
(Date_Time.new 2022 12 12).should_equal (Date_Time.new 2022 12 12)
|
||||
(Date_Time.new 2022 12 12).should_not_equal (Date_Time.new 1996)
|
||||
|
||||
suite_builder.group "Unix epoch conversion" group_builder->
|
||||
group_builder.specify "should allow creating from second and nanosecond" <|
|
||||
Date_Time.from_unix_epoch_seconds 0 . should_equal (Date_Time.new 1970 1 1 zone=Time_Zone.utc)
|
||||
Date_Time.from_unix_epoch_seconds 1 . should_equal (Date_Time.new 1970 1 1 0 0 1 zone=Time_Zone.utc)
|
||||
Date_Time.from_unix_epoch_seconds 1 123456789 . should_equal (Date_Time.new 1970 1 1 0 0 1 nanosecond=123456789 zone=Time_Zone.utc)
|
||||
Date_Time.from_unix_epoch_seconds 1704371744 . should_equal (Date_Time.new 2024 1 4 12 35 44 zone=Time_Zone.utc)
|
||||
Date_Time.from_unix_epoch_seconds 1704371744 123456789 . should_equal (Date_Time.new 2024 1 4 12 35 44 nanosecond=123456789 zone=Time_Zone.utc)
|
||||
|
||||
group_builder.specify "should allow creating from milliseconds" <|
|
||||
Date_Time.from_unix_epoch_milliseconds 0 . should_equal (Date_Time.new 1970 1 1 zone=Time_Zone.utc)
|
||||
Date_Time.from_unix_epoch_milliseconds 1 . should_equal (Date_Time.new 1970 1 1 0 0 0 millisecond=1 zone=Time_Zone.utc)
|
||||
Date_Time.from_unix_epoch_milliseconds 123 . should_equal (Date_Time.new 1970 1 1 0 0 0 millisecond=123 zone=Time_Zone.utc)
|
||||
Date_Time.from_unix_epoch_milliseconds 1704371744123 . should_equal (Date_Time.new 2024 1 4 12 35 44 millisecond=123 zone=Time_Zone.utc)
|
||||
|
||||
group_builder.specify "should allow convert to epoch seconds" <|
|
||||
Date_Time.new 1970 zone=Time_Zone.utc . to_unix_epoch_seconds . should_equal 0
|
||||
Date_Time.new 1970 1 1 0 0 1 zone=Time_Zone.utc . to_unix_epoch_seconds . should_equal 1
|
||||
Date_Time.new 1970 1 1 0 0 1 nanosecond=123456789 zone=Time_Zone.utc . to_unix_epoch_seconds . should_equal 1
|
||||
Date_Time.new 2024 1 4 12 35 44 zone=Time_Zone.utc . to_unix_epoch_seconds . should_equal 1704371744
|
||||
|
||||
group_builder.specify "should allow convert to epoch milliseconds" <|
|
||||
Date_Time.new 1970 zone=Time_Zone.utc . to_unix_epoch_milliseconds . should_equal 0
|
||||
Date_Time.new 1970 1 1 0 0 0 millisecond=1 zone=Time_Zone.utc . to_unix_epoch_milliseconds . should_equal 1
|
||||
Date_Time.new 1970 1 1 0 0 0 millisecond=1 microsecond=123 zone=Time_Zone.utc . to_unix_epoch_milliseconds . should_equal 1
|
||||
Date_Time.new 2024 1 4 12 35 44 millisecond=123 zone=Time_Zone.utc . to_unix_epoch_milliseconds . should_equal 1704371744123
|
||||
|
||||
spec_with suite_builder name create_new_datetime parse_datetime nanoseconds_loss_in_precision=False =
|
||||
suite_builder.group name group_builder->
|
||||
|
||||
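The epoch fixtures above also pin down that the two directions are mutually inverse, up to sub-second truncation; a small round-trip sketch in the same style:

    dt = Date_Time.new 2024 1 4 12 35 44 zone=Time_Zone.utc
    Date_Time.from_unix_epoch_seconds dt.to_unix_epoch_seconds . should_equal dt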
|
@ -200,7 +200,7 @@ add_specs suite_builder setup =
case setup.test_selection.supports_mixed_columns of
False -> callback_with_clue (setup.table_builder table_structure)
True ->
all_combinations (Vector.fill table_structure.length [Nothing, Mixed_Type_Object.Value]) . each combination->
all_combinations (Vector.fill table_structure.length [Nothing, Mixed_Type_Object]) . each combination->
amended_table_structure = table_structure.zip combination column_definition-> prefix->
name = column_definition.first
values = column_definition.second
@ -1753,7 +1753,6 @@ add_specs suite_builder setup =

# A dummy value used to force the in-memory backend to infer a mixed type for the given column.
type Mixed_Type_Object
Value

all_combinations variables =
result = Vector.new_builder
|
@ -1,4 +1,5 @@
from Standard.Base import all
import Standard.Base.Errors.Common.Index_Out_Of_Bounds
import Standard.Base.Errors.Illegal_Argument.Illegal_Argument

from Standard.Table import Table, Sort_Column
@ -170,7 +171,19 @@ add_specs (suite_builder : Suite_Builder) (prefix : Text) (create_connection_fn
group_builder.specify "should allow to materialize columns directly into a Vector" <|
v = data.t1.at 'a' . to_vector
v . should_equal [1, 4]

group_builder.specify "should allow getting specific elements" <|
test_column = data.t1.at 'a'
test_column.get 0 . should_equal 1
test_column.get 3 . should_equal Nothing
test_column.get 4 -1 . should_equal -1

group_builder.specify "should allow getting specific elements (with at)" <|
test_column = data.t1.at 'a'
test_column.at 0 . should_equal 1
test_column.at 1 . should_equal 4
test_column.at 3 . should_fail_with Index_Out_Of_Bounds

group_builder.specify "should handle bigger result sets" <|
data.big_table.read.row_count . should_equal data.big_size
|
@ -26,6 +26,13 @@ add_specs suite_builder =
empty_column = Column.from_vector "Test" []

group_builder.specify "should allow getting specific elements" <|
test_column.get 0 . should_equal 1
test_column.get 2 . should_equal 5
test_column.get 5 . should_equal 6
test_column.get 6 . should_equal Nothing
empty_column.get 0 -1 . should_equal -1

group_builder.specify "should allow getting specific elements (with at)" <|
test_column.at 0 . should_equal 1
test_column.at 2 . should_equal 5
test_column.at 5 . should_equal 6
|
@ -12,6 +12,7 @@ import project.In_Memory.Lossy_Conversions_Spec
import project.In_Memory.Parse_To_Table_Spec
import project.In_Memory.Split_Tokenize_Spec
import project.In_Memory.Table_Spec
import project.In_Memory.Table_Conversion_Spec
import project.In_Memory.Table_Date_Spec
import project.In_Memory.Table_Date_Time_Spec
import project.In_Memory.Table_Time_Of_Day_Spec
@ -23,6 +24,7 @@ add_specs suite_builder =
Common_Spec.add_specs suite_builder
Integer_Overflow_Spec.add_specs suite_builder
Lossy_Conversions_Spec.add_specs suite_builder
Table_Conversion_Spec.add_specs suite_builder
Table_Date_Spec.add_specs suite_builder
Table_Date_Time_Spec.add_specs suite_builder
Table_Time_Of_Day_Spec.add_specs suite_builder
|
@ -16,6 +16,14 @@ add_specs suite_builder =
|
||||
expected = Table.from_rows ["foo", "bar 1", "bar 2", "bar 3"] expected_rows
|
||||
t2 = t.split_to_columns "bar" "|"
|
||||
t2.should_equal expected
|
||||
|
||||
group_builder.specify "can do split_to_columns by index" <|
|
||||
cols = [["foo", [0, 1, 2]], ["bar", ["a|c", "c|d|ef", "gh|ij|u"]]]
|
||||
t = Table.new cols
|
||||
expected_rows = [[0, "a", "c", Nothing], [1, "c", "d", "ef"], [2, "gh", "ij", "u"]]
|
||||
expected = Table.from_rows ["foo", "bar 1", "bar 2", "bar 3"] expected_rows
|
||||
t2 = t.split_to_columns 1 "|"
|
||||
t2.should_equal expected
|
||||
|
||||
group_builder.specify "can do split_to_columns where split character, first, last and only character" <|
|
||||
cols = [["foo", [0, 1, 2]], ["bar", ["|cb", "ab|", "|"]]]
|
||||
@ -41,6 +49,14 @@ add_specs suite_builder =
|
||||
t2 = t.split_to_rows "bar" "|"
|
||||
t2.should_equal expected
|
||||
|
||||
group_builder.specify "can do split_to_rows by index" <|
|
||||
cols = [["foo", [0, 1, 2]], ["bar", ["a|c", "c|d|ef", "gh|ij|u"]]]
|
||||
t = Table.new cols
|
||||
expected_rows = [[0, "a"], [0, "c"], [1, "c"], [1, "d"], [1, "ef"], [2, "gh"], [2, "ij"], [2, "u"]]
|
||||
expected = Table.from_rows ["foo", "bar"] expected_rows
|
||||
t2 = t.split_to_rows 1 "|"
|
||||
t2.should_equal expected
|
||||
|
||||
group_builder.specify "can do split_to_rows where split character, first, last and only character" <|
|
||||
cols = [["foo", [0, 1, 2]], ["bar", ["|cb", "ab|", "|"]]]
|
||||
t = Table.new cols
|
||||
@ -82,6 +98,14 @@ add_specs suite_builder =
            t2 = t.tokenize_to_columns "bar" "\d+"
            t2.should_equal expected

        group_builder.specify "can do tokenize_to_columns by index" <|
            cols = [["foo", [0, 1, 2]], ["bar", ["a12b34r5", "23", "2r4r55"]]]
            t = Table.new cols
            expected_rows = [[0, "12", "34", "5"], [1, "23", Nothing, Nothing], [2, "2", "4", "55"]]
            expected = Table.from_rows ["foo", "bar 1", "bar 2", "bar 3"] expected_rows
            t2 = t.tokenize_to_columns 1 "\d+"
            t2.should_equal expected

        group_builder.specify "can do tokenize_to_rows" <|
            cols = [["foo", [0, 1, 2]], ["bar", ["a12b34r5", "23", "2r4r55"]]]
            t = Table.new cols
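
Where `split_to_columns` takes a literal delimiter, `tokenize_to_columns` extracts every match of a regular expression, again padding shorter rows with Nothing. A sketch mirroring the spec above:

    from Standard.Table import all

    main =
        t = Table.new [["foo", [0, 1, 2]], ["bar", ["a12b34r5", "23", "2r4r55"]]]
        t.tokenize_to_columns 1 "\d+"
        # "bar 1".."bar 3" per row: ["12", "34", "5"], ["23", Nothing, Nothing], ["2", "4", "55"]
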
@ -90,6 +114,14 @@ add_specs suite_builder =
            t2 = t.tokenize_to_rows "bar" "\d+"
            t2.should_equal expected

        group_builder.specify "can do tokenize_to_rows by index" <|
            cols = [["foo", [0, 1, 2]], ["bar", ["a12b34r5", "23", "2r4r55"]]]
            t = Table.new cols
            expected_rows = [[0, "12"], [0, "34"], [0, "5"], [1, "23"], [2, "2"], [2, "4"], [2, "55"]]
            expected = Table.from_rows ["foo", "bar"] expected_rows
            t2 = t.tokenize_to_rows 1 "\d+"
            t2.should_equal expected

        group_builder.specify "can do tokenize_to_columns with some nothings" <|
            cols = [["foo", [0, 1, 2, 3]], ["bar", ["a12b34r5", Nothing, "23", "2r4r55"]]]
            t = Table.new cols
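
The row-wise variant emits one row per regex match, so a cell with a single match contributes exactly one row. A sketch:

    from Standard.Table import all

    main =
        t = Table.new [["foo", [0, 1]], ["bar", ["a12b34r5", "23"]]]
        t.tokenize_to_rows "bar" "\d+"
        # rows: [0, "12"], [0, "34"], [0, "5"], [1, "23"]
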
@ -269,6 +301,12 @@ add_specs suite_builder =
            actual = t.parse_to_columns "bar" "(\d)(\d)"
            actual.should_equal expected

        group_builder.specify "can parse to columns by index" <|
            t = Table.from_rows ["foo", "bar", "baz"] [["x", "12 34p q56", "y"], ["xx", "a48 59b", "yy"]]
            expected = Table.from_rows ["foo", "bar 1", "bar 2", "baz"] [["x", 1, 2, "y"], ["x", 3, 4, "y"], ["x", 5, 6, "y"], ["xx", 4, 8, "yy"], ["xx", 5, 9, "yy"]]
            actual = t.parse_to_columns 1 "(\d)(\d)"
            actual.should_equal expected

        group_builder.specify "no regex groups" <|
            t = Table.from_rows ["foo", "bar", "baz"] [["x", "12 34p q56", "y"], ["xx", "a48 59b", "yy"]]
            expected = Table.from_rows ["foo", "bar", "baz"] [["x", 12, "y"], ["x", 34, "y"], ["x", 56, "y"], ["xx", 48, "yy"], ["xx", 59, "yy"]]
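
`parse_to_columns` combines both ideas: each regex match becomes a row, and each capture group within a match becomes a column (with values parsed to numbers, as the expectations above show). A sketch of the by-index form:

    from Standard.Table import all

    main =
        t = Table.from_rows ["foo", "bar", "baz"] [["x", "12 34p q56", "y"]]
        t.parse_to_columns 1 "(\d)(\d)"
        # "bar 1" and "bar 2" hold the two captured digits of each match:
        # ["x", 1, 2, "y"], ["x", 3, 4, "y"], ["x", 5, 6, "y"]
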
@ -77,12 +77,12 @@ add_specs suite_builder =

    suite_builder.group "from_objects with JSON (single values)" group_builder->
        group_builder.specify "Generates a single-row table from a JSON object" <|
            expected = Table.from_rows ["first", "last", "age"] [["Mary", "Smith", 23]]
            expected = Table.new [["Key", ["first", "last", "age"]], ["Value", ["Mary", "Smith", 23]]]
            Table.from_objects (data.uniform_json.at 0) . should_equal expected

        group_builder.specify "works fine even if requested fields are duplicated" <|
            expected = Table.from_rows ["first", "last"] [["Mary", "Smith"]]
            Table.from_objects (data.uniform_json.at 0) ["first", "last", "first", "first"] . should_equal expected
            expected = Table.new [["Key", ["first", "last", "age"]], ["Value", ["Mary", "Smith", 23]]]
            Table.from_objects (data.uniform_json.at 0) ["Key", "Value", "Key", "Key"] . should_equal expected

    suite_builder.group "from_objects with uniform JSON vector" group_builder->
        group_builder.specify "Generates a table from a vector of JSON objects" <|
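
The behavioural change in this hunk: `Table.from_objects` on a single JSON object no longer builds a one-row table keyed by field name, but a two-column Key/Value table with one row per field. A sketch, assuming the object parsed below matches the `uniform_json` fixture used by the spec:

    from Standard.Table import all

    main =
        obj = '{"first": "Mary", "last": "Smith", "age": 23}'.parse_json
        Table.from_objects obj
        # Key   | Value
        # first | Mary
        # last  | Smith
        # age   | 23
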
@ -159,9 +159,19 @@ add_specs suite_builder =
    suite_builder.group "expand_column" group_builder->
        group_builder.specify "Expands a column of single values" <|
            table = Table.new [["aaa", [1, 2]], ["bbb", [3, 4]], ["ccc", [5, 6]]]
            expected = Table.new [["aaa", [1, 2]], ["bbb Value", [3, 4]], ["ccc", [5, 6]]]
            expected = Table.new [["aaa", [1, 2]], ["bbb", [3, 4]], ["ccc", [5, 6]]]
            table.expand_column "bbb" . should_equal expected

        group_builder.specify "Expands a column of single values by index" <|
            table = Table.new [["aaa", [1, 2]], ["bbb", [3, 4]], ["ccc", [5, 6]]]
            expected = Table.new [["aaa", [1, 2]], ["bbb", [3, 4]], ["ccc", [5, 6]]]
            table.expand_column 1 . should_equal expected

        group_builder.specify "Expands a column of single values by index" <|
            table = Table.new [["aaa", [1, 2]], ["bbb", [3, 4]], ["ccc", [5, 6]]]
            expected = Table.new [["aaa", [1, 2]], ["bbb", [3, 4]], ["ccc", [5, 6]]]
            table.expand_column 1 . should_equal expected

        group_builder.specify "Expands a uniform column of JSON objects" <|
            table = Table.new [["aaa", [1, 2]], ["bbb", data.uniform_json], ["ccc", [5, 6]]]
            expected = Table.new [["aaa", [1, 2]], ["bbb first", ["Mary", "Joe"]], ["bbb last", ["Smith", "Burton"]], ["bbb age", [23, 34]], ["ccc", [5, 6]]]

@ -182,9 +192,19 @@ add_specs suite_builder =
            expected = Table.new [["aaa", [1, 2]], ["bbb last", ["Smith", Nothing]], ["bbb height", [Nothing, 1.9]], ["bbb foo", [Nothing, Nothing]], ["ccc", [5, 6]]]
            table.expand_column "bbb" ["last", "height", "foo"] . should_equal expected

        group_builder.specify "accept vectors/arrays within a column" <|
        group_builder.specify "Expands vectors/arrays within a column" <|
            table = Table.new [["aaa", [1, 2]], ["bbb", [[1, 2, 3], [4, 5, 6].to_array]]]
            expected = Table.new [["aaa", [1, 2]], ["bbb Value", [[1, 2, 3], [4, 5, 6].to_array]]]
            expected = Table.new [["aaa", [1, 2]], ["bbb 0", [1, 4]], ["bbb 1", [2, 5]], ["bbb 2", [3, 6]]]
            table.expand_column "bbb" . should_equal expected

        group_builder.specify "Expands ranges within a column" <|
            table = Table.new [["aaa", [1, 2]], ["bbb", [0.up_to 2, 3.up_to 5]]]
            expected = Table.new [["aaa", [1, 2]], ["bbb 0", [0, 3]], ["bbb 1", [1, 4]]]
            table.expand_column "bbb" . should_equal expected

        group_builder.specify "Expands date ranges within a column" <|
            table = Table.new [["aaa", [1, 2]], ["bbb", [Date.new 2020 12 1 . up_to (Date.new 2020 12 3), Date.new 2022 12 1 . up_to (Date.new 2022 12 2)]]]
            expected = Table.new [["aaa", [1, 2]], ["bbb 0", [Date.new 2020 12 1, Date.new 2022 12 1]], ["bbb 1", [Date.new 2020 12 2, Nothing]]]
            table.expand_column "bbb" . should_equal expected

        group_builder.specify "will work even if keys are not Text" <|
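
Two changes are visible in this hunk: an expanded single-value column now keeps its input name (no ` Value` suffix), and `expand_column` now understands vectors, arrays, and (date) ranges, turning element i of each sequence into a column named after the input plus the index. A sketch of the sequence case:

    from Standard.Table import all

    main =
        table = Table.new [["aaa", [1, 2]], ["bbb", [[1, 2, 3], [4, 5, 6]]]]
        table.expand_column "bbb"
        # => columns "aaa", "bbb 0", "bbb 1", "bbb 2"
        # element i of each nested vector lands in column "bbb i"
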
@ -214,7 +234,7 @@ add_specs suite_builder =
            table = Table.new [["aaa", [1, 2]], ["bbb", [Map.from_vector [], Map.from_vector []]], ["ccc", [5, 6]]]
            r = table.expand_column "bbb"
            r.should_fail_with Illegal_Argument
            r.catch.message.should_contain "all input objects had no fields"
            r.catch.message.should_contain "as all inputs had no fields"

        group_builder.specify "will error when fields=[]" <|
            table = Table.new [["aaa", [1, 2]], ["bbb", data.uniform_json], ["ccc", [5, 6]]]
@ -249,6 +269,12 @@ add_specs suite_builder =
            expected = Table.new [["aaa", [1, 2]], ["bbb", [3, 4]], ["ccc", [5, 6]]]
            table.expand_to_rows "bbb" . should_equal expected

        group_builder.specify "Can expand single values by index" <|
            values_to_expand = [3, 4]
            table = Table.new [["aaa", [1, 2]], ["bbb", values_to_expand], ["ccc", [5, 6]]]
            expected = Table.new [["aaa", [1, 2]], ["bbb", [3, 4]], ["ccc", [5, 6]]]
            table.expand_to_rows 1 . should_equal expected

        group_builder.specify "Can expand Vectors" <|
            values_to_expand = [[10, 11], [20, 21, 22], [30]]
            table = Table.new [["aaa", [1, 2, 3]], ["bbb", values_to_expand], ["ccc", [5, 6, 7]]]
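
`expand_to_rows` is the row-wise counterpart: every element of a nested value becomes its own row, with the remaining columns repeated alongside. A sketch mirroring the Vectors spec above:

    from Standard.Table import all

    main =
        table = Table.new [["aaa", [1, 2, 3]], ["bbb", [[10, 11], [20, 21, 22], [30]]]]
        table.expand_to_rows "bbb"
        # "aaa" repeats once per element: [1, 1, 2, 2, 2, 3]
        # "bbb" flattens to:              [10, 11, 20, 21, 22, 30]
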
@ -270,9 +296,9 @@ add_specs suite_builder =
            table.expand_to_rows "bbb" . should_equal expected

        group_builder.specify "Can expand Pairs" <|
            values_to_expand = [Pair.new 10 20, Pair.new "a" [30], Pair.new 40 50]
            values_to_expand = [Pair.new 10 20, Pair.new "a" 30, Pair.new 40 50]
            table = Table.new [["aaa", [1, 2, 3]], ["bbb", values_to_expand], ["ccc", [5, 6, 7]]]
            expected = Table.new [["aaa", [1, 1, 2, 2, 3, 3]], ["bbb", [10, 20, "a", [30], 40, 50]], ["ccc", [5, 5, 6, 6, 7, 7]]]
            expected = Table.new [["aaa", [1, 1, 2, 2, 3, 3]], ["bbb", [10, 20, "a", 30, 40, 50]], ["ccc", [5, 5, 6, 6, 7, 7]]]
            table.expand_to_rows "bbb" . should_equal expected

        group_builder.specify "Can expand Ranges" <|
@ -291,6 +317,18 @@ add_specs suite_builder =
            expected = Table.new [["aaa", [1, 1, 2, 2, 2, 3, 3, 3]], ["bbb", values_expanded], ["ccc", [5, 5, 6, 6, 6, 7, 7, 7]]]
            table.expand_to_rows "bbb" . should_equal expected

        group_builder.specify "Can expand Map" <|
            values_to_expand = [Map.empty.insert "a" 10, Map.empty.insert "d" 40 . insert "b" 20, Map.empty.insert "c" 30]
            table = Table.new [["aaa", [1, 2, 3]], ["bbb", values_to_expand], ["ccc", [5, 6, 7]]]
            expected = Table.new [["aaa", [1, 2, 2, 3]], ["bbb Key", ["a", "d", "b", "c"]], ["bbb", [10, 40, 20, 30]], ["ccc", [5, 6, 6, 7]]]
            table.expand_to_rows "bbb" . should_equal expected

        group_builder.specify "Can expand JS_Object" <|
            values_to_expand = ['{"a": 10}'.parse_json, '{"b": 20, "d": 40}'.parse_json, '{"c": 30}'.parse_json]
            table = Table.new [["aaa", [1, 2, 3]], ["bbb", values_to_expand], ["ccc", [5, 6, 7]]]
            expected = Table.new [["aaa", [1, 2, 2, 3]], ["bbb Key", ["a", "b", "d", "c"]], ["bbb", [10, 20, 40, 30]], ["ccc", [5, 6, 6, 7]]]
            table.expand_to_rows "bbb" . should_equal expected

        group_builder.specify "Can expand mixed columns" <|
            values_to_expand = [[10, 11], 22.up_to 26, (Date.new 2020 02 28).up_to (Date.new 2020 03 01)]
            values_expanded = [10, 11, 22, 23, 24, 25, Date.new 2020 02 28, Date.new 2020 02 29]
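
Key-value containers expand differently: each entry becomes a row, and an extra `<input> Key` column records the key next to the value, as the `Map` and `JS_Object` specs above assert. A sketch:

    from Standard.Table import all

    main =
        values = [Map.empty.insert "a" 10, Map.empty.insert "c" 30]
        table = Table.new [["aaa", [1, 2]], ["bbb", values]]
        table.expand_to_rows "bbb"
        # => columns "aaa", "bbb Key", "bbb"
        # rows: [1, "a", 10], [2, "c", 30]
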
@ -345,7 +383,7 @@ add_specs suite_builder =
            t.at "Name" . to_vector . should_equal (Vector.fill 5 "Library")
            t.at "@catalog" . to_vector . should_equal (Vector.fill 5 "Fiction")
            t.at "@letter" . to_vector . should_equal (Vector.fill 5 "A")
            t.at "Children Value" . to_vector . map trim_if_text . should_equal ["Hello", "My Book", "World", "Your Book", "Cheap Cars For You"]
            t.at "Children" . to_vector . map trim_if_text . should_equal ["Hello", "My Book", "World", "Your Book", "Cheap Cars For You"]
            t.at "Children Name" . to_vector . map trim_if_text . should_equal [Nothing, "Book", Nothing, "Book", "Magazine"]
            t.at "Children @author" . to_vector . map trim_if_text . should_equal [Nothing, "An Author", Nothing, "Another Author", Nothing]
            t.at "Children @month" . to_vector . map trim_if_text . should_equal [Nothing, Nothing, Nothing, Nothing, 'August-2023']
@ -1,2 +1 @@
#license
/FasterXML/jackson-dataformats-binary/blob/2.17/LICENSE

@ -0,0 +1 @@
META-INF/LICENSE

@ -0,0 +1 @@
META-INF/NOTICE

@ -0,0 +1,2 @@
Copyright (c) 2007- Tatu Saloranta, tatu.saloranta@iki.fi
Copyright 2018-2020 Raffaello Giulietti

@ -0,0 +1,2 @@
META-INF/LICENSE
META-INF/jackson-core-LICENSE

@ -0,0 +1 @@
META-INF/jackson-core-NOTICE

@ -0,0 +1,2 @@
Copyright 2010 Google Inc. All Rights Reserved.
Copyright 2011 Google Inc. All Rights Reserved.

@ -0,0 +1 @@
META-INF/LICENSE

@ -0,0 +1 @@
META-INF/NOTICE

@ -1,3 +1,3 @@
58F42EA238F4F16E775412B67F584C74188267FB305705B57A50E10124FE56BC
DC3F2E51015236DC72560E5DD29B13156A2244C6753828B6D0848683018D5ABA
44A2EB4467C91025C305D370F3E8C9430A69FCD957630A539385AB785B0A1C6D
5F5974B8673A2E82B0148235CCE1FC0DD9FB8D3ED9C9552A4D86A4EE14723DE5
0

@ -1 +1 @@
/com-lihaoyi/fansi/blob/master/LICENSE
/com-lihaoyi/Fansi/blob/master/LICENSE

@ -1,2 +1 @@
/circe/circe/blob/series/0.14.x/LICENSE
#license

@ -1,2 +1 @@
/circe/circe-generic-extras/blob/main/LICENSE
#license

@ -1,2 +1 @@
/circe/circe/blob/series/0.14.x/LICENSE
#license

@ -1,2 +1 @@
/circe/circe/blob/series/0.14.x/LICENSE
#license

@ -1,2 +1 @@
/circe/circe/blob/series/0.14.x/LICENSE
#license

@ -1,2 +1 @@
/circe/circe/blob/series/0.14.x/LICENSE
#license

@ -1,2 +1 @@
/circe/circe/blob/series/0.14.x/LICENSE
#license

@ -1 +0,0 @@
#license

@ -1 +0,0 @@
#license

@ -1 +1 @@
/Philippus/bump/blob/main/LICENSE.md
/philippus/bump/blob/main/LICENSE.md

@ -1,2 +1 @@
/circe/circe/blob/series/0.14.x/LICENSE
#license

@ -1,2 +1 @@
/circe/circe/blob/series/0.14.x/LICENSE
#license

@ -1,2 +1 @@
/circe/circe/blob/series/0.14.x/LICENSE
#license

@ -1,2 +1 @@
/circe/circe/blob/series/0.14.x/LICENSE
#license

@ -1,2 +1 @@
/circe/circe/blob/series/0.14.x/LICENSE
#license

@ -1 +0,0 @@
#license

@ -1 +0,0 @@
#license

@ -1 +1 @@
/Philippus/bump/blob/main/LICENSE.md
/philippus/bump/blob/main/LICENSE.md

@ -1,3 +1,2 @@
/pureconfig/pureconfig/blob/master/AUTHORS
/pureconfig/pureconfig/blob/master/LICENSE
#license

@ -1,3 +1,2 @@
/pureconfig/pureconfig/blob/master/AUTHORS
/pureconfig/pureconfig/blob/master/LICENSE
#license

@ -1,3 +1,2 @@
/pureconfig/pureconfig/blob/master/AUTHORS
/pureconfig/pureconfig/blob/master/LICENSE
#license

@ -1,2 +1 @@
#license
/pureconfig/pureconfig/blob/master/LICENSE

@ -1,2 +1 @@
/circe/circe/blob/series/0.14.x/LICENSE
#license

@ -1,2 +1 @@
/circe/circe/blob/series/0.14.x/LICENSE
#license

@ -1,2 +1 @@
/circe/circe/blob/series/0.14.x/LICENSE
#license

@ -1,2 +1 @@
/circe/circe/blob/series/0.14.x/LICENSE
#license

@ -1,2 +1 @@
/circe/circe/blob/series/0.14.x/LICENSE
#license

@ -1,2 +1 @@
/circe/circe/blob/series/0.14.x/LICENSE
#license

@ -1 +0,0 @@
#license

@ -1 +0,0 @@
#license

@ -1 +1 @@
/Philippus/bump/blob/main/LICENSE.md
/philippus/bump/blob/main/LICENSE.md