Libraries Repository (#1804)

Radosław Waśko 2021-06-22 13:35:15 +02:00 committed by GitHub
parent 6950888bb0
commit 1d124d7770
20 changed files with 725 additions and 35 deletions

View File

@ -51,6 +51,8 @@
`default` engine version, which may be unexpected. Ideally, after migration,
the project should be used only with the new tools. The affected tools are the
Launcher and the Project Manager.
- Added documentation and a minimal tool for hosting custom library repositories
([#1804](https://github.com/enso-org/enso/pull/1804)).
## Libraries

View File

@ -1219,6 +1219,7 @@ lazy val editions = project
"org.scalatest" %% "scalatest" % scalatestVersion % Test
)
)
.dependsOn(testkit % Test)
lazy val `library-manager` = project
.in(file("lib/scala/library-manager"))

View File

@ -12,3 +12,5 @@ Documents in this section describe Enso's library ecosystem.
- [**Editions:**](./editions.md) Information on Editions, the concept that
organizes library versioning.
- [**Repositories:**](./repositories.md) Information on the structure of
repositories providing Enso libraries and Editions.

View File

@ -0,0 +1,303 @@
---
layout: developer-doc
title: Repositories
category: libraries
tags: [repositories, libraries, editions]
order: 2
---
# Repositories
This document describes the format of repositories that are providing Enso
libraries and Editions.
<!-- MarkdownTOC levels="2,3" autolink="true" -->
- [General Repository Design](#general-repository-design)
- [Libraries Repository](#libraries-repository)
- [The Manifest File](#the-manifest-file)
- [The Sub-Archives](#the-sub-archives)
- [Example](#example-1)
- [Editions Repository](#editions-repository)
- [Naming the Editions](#naming-the-editions)
- [Example Edition Provider Repository](#example-edition-provider-repository)
- [The Simple Library Server](#the-simple-library-server)
<!-- /MarkdownTOC -->
## General Repository Design
Library and Edition providers are based on the HTTP(S) protocol. They are
designed so that they can be backed by simple file storage exposed over HTTP,
but the backend may be implemented differently as long as it conforms to the
specification.

It is recommended that the server support sending text files with
`Content-Encoding: gzip` to transmit the larger manifest files more
efficiently, but this is optional and sending them without compression is also
acceptable. In either case, Enso tools will send `Accept-Encoding: gzip` to
indicate that they support compressed transmission.
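The negotiation described above can be sketched from the client's side. This is a minimal illustration and not code from the Enso tools: `decodeBody` is a hypothetical helper, and the response headers are assumed to be available as an object with lower-cased keys.

```javascript
const zlib = require("zlib");

// Headers a client would send to signal that it accepts gzip-compressed bodies.
const requestHeaders = { "Accept-Encoding": "gzip" };

// Decodes a response body (a Buffer) according to the response headers.
// Hypothetical helper for illustration only.
function decodeBody(responseHeaders, body) {
  const encoding = (responseHeaders["content-encoding"] || "").toLowerCase();
  if (encoding === "gzip") {
    // The server chose to compress the response.
    return zlib.gunzipSync(body).toString("utf8");
  }
  // Uncompressed responses are equally acceptable.
  return body.toString("utf8");
}
```

Because compression is optional for the server, the client has to inspect `Content-Encoding` on every response rather than assume that its `Accept-Encoding` request was honoured.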
## Libraries Repository
The libraries repository should contain a separate 'directory' for each prefix;
inside it, each library has its own directory named after the library, which in
turn contains one directory per published version. Each version directory
contains the following files:
- `manifest.yaml` - the helper file that tells the tool what it should download,
it is explained in more detail below;
- `package.yaml` -
[the package file](../distribution/packaging.md#the-packageyaml-file) of the
library;
- `meta` - an optional metadata directory, that may be used by the marketplace;
- `LICENSE.md` - the license associated with the library; our official
repository requires it, whereas internal company repositories may omit it, in
which case a warning will be emitted during installation;
- `*.tgz` packages containing sources and other data files of the library, split
into components as explained below.
The directory structure is as below:
```
root
└── Prefix # The author's username.
    └── Library_Name # The name of the library.
        └── 1.2.3 # Version of a particular library package.
            ├── meta # (Optional) Library metadata for display in the marketplace.
            │   ├── preview.png
            │   └── icon.png
            ├── main.tgz # The compressed package containing sources of the library.
            ├── tests.tgz # A package containing the test sources.
            ├── LICENSE.md
            ├── package.yaml
            └── manifest.yaml
```
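Given this layout, the URL of any file of a library version follows directly from the library's qualified name. The sketch below is only an illustration: `libraryFileUrl` and its `repositoryRoot` parameter are hypothetical names, not part of the Enso tools.

```javascript
// Builds the URL of a file belonging to a specific version of a library.
// `repositoryRoot` is the base URL of the libraries repository and
// `libraryName` is the qualified name, e.g. "Foo.Bar" (hypothetical helper).
function libraryFileUrl(repositoryRoot, libraryName, version, fileName) {
  const parts = libraryName.split(".");
  if (parts.length !== 2) {
    throw new Error(`\`${libraryName}\` is not a valid library name.`);
  }
  // Drop trailing slashes so that the segments can be joined uniformly.
  const root = repositoryRoot.replace(/\/+$/, "");
  const segments = [parts[0], parts[1], version, fileName];
  return [root, ...segments.map((s) => encodeURIComponent(s))].join("/");
}
```

For instance, the manifest of `Foo.Bar` version `1.2.3` in a repository rooted at `http://hostname:port/libraries` would be located at `http://hostname:port/libraries/Foo/Bar/1.2.3/manifest.yaml`.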
### The Manifest File
The manifest file is a YAML file with the following fields:
- `archives` - a list of archive names that are available for the given library;
at least one archive must be present (as otherwise the package would be
completely empty);
- `dependencies` - a list of dependencies, as described below;
- `description` - an optional description of the library that is displayed in
the `info` and `search` command results;
- `tag-line` - an optional tagline that will be displayed in the marketplace
interface.
As the protocol does not define a common way of listing directories, the primary
purpose of the manifest file is to list the available archive packages, so that
the downloader can know what archives it should try downloading.
Additionally, the manifest may contain a list of (direct) dependencies the
library relies on. This list is in a way redundant, because the dependencies
can be inferred from the library's imports, but its presence is desirable:
without it, the downloader would need to download the whole sources package
(which may be large) before it could deduce the dependencies. If they are
defined in the manifest file, the manifests of all transitive dependencies can
be downloaded up-front, which allows for a better estimate of how much must be
downloaded before the library can actually be loaded, improving the user's
experience.

It is not an error for an imported dependency to be missing from the manifest
(in fact, the manifest may list no dependencies at all); in such a case the
dependency will be downloaded when the library is first loaded. However, it is
strongly recommended that these dependencies be included, as doing so greatly
improves the user's experience.

The dependencies consist only of library names, with no version numbers, as the
particular version of each dependency is determined by the edition that is used
in a given project.
> The upload tool will automatically parse the imports and generate the manifest
> containing the dependencies.
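
The up-front resolution of transitive dependencies can be sketched as a simple graph traversal over manifests. This is an illustration under assumptions: `loadManifest` is a hypothetical function that returns the parsed manifest of a library (in the version selected by the edition), or `undefined` if none is available.

```javascript
// Collects all transitive dependencies of a library by following the
// `dependencies` lists of the manifests (breadth-first).
function transitiveDependencies(libraryName, loadManifest) {
  const visited = new Set();
  const queue = [libraryName];
  while (queue.length > 0) {
    const current = queue.shift();
    const manifest = loadManifest(current);
    // The `dependencies` field is optional, so default to an empty list.
    const dependencies = (manifest && manifest.dependencies) || [];
    for (const dependency of dependencies) {
      // Skip already-seen libraries so that cycles do not loop forever.
      if (!visited.has(dependency) && dependency !== libraryName) {
        visited.add(dependency);
        queue.push(dependency);
      }
    }
  }
  return [...visited];
}
```

Dependencies that are not declared in any manifest are simply not discovered by such a traversal; as noted above, they would only be downloaded when the library is first loaded.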
#### Example
An example `manifest.yaml` file may have the following structure:
```yaml
archives:
- main.tgz
- tests.tgz
dependencies:
- Standard.Base
- Foo.Bar
```
### The Sub-Archives
The published library consists of sub-archives, which make it possible to
download only selected parts of the library.

Each downloaded archive will be extracted to the library's root directory in
such a way that common directories from multiple archives are merged on
extraction. However, different packages should not contain overlapping files, as
there would be no way to determine which of the files should be kept when the
packages are extracted.
> It is not an error if multiple downloaded packages contain conflicting files,
> but there are no guarantees as to which of the conflicting files is kept.
The package called `tests` is treated specially - it will not be downloaded by
default, as tests (which may contain heavy data files) are not necessary to use
the library.
> In the future, we will introduce platform-specific sub-archives. The initial
> idea is that if an archive name has the format `<prefix>-<os>-<arch>.tgz`,
> where `os` is one of `windows`, `macos`, `linux` and `arch` is `amd64` (other
> values may become available in the future), the package is only downloaded if
> the current machine is running the same OS and architecture as indicated by
> its name. This is, however, a draft and the particular logic may be modified
> before it is implemented. Since the current behaviour is to download all
> packages (except for `tests`), adding this feature will be backwards
> compatible, because older versions will simply download the packages for every
> system (which is unnecessary, but not incorrect).
All other packages are always downloaded by default. This may however change in
the future with additional reserved names with special behaviour being added.
There is no special name for a default package that should always be
downloaded; the only requirement is that the library should consist of at least
one package that is downloaded on every supported operating system (as otherwise
it would be empty). A safe name to choose is `main.tgz`, as this name is
guaranteed never to become reserved, and so it will always be downloaded.
The archives should be `tar` archives compressed with the `gzip` algorithm and
should always have the `.tgz` extension (which is a shorthand for `.tar.gz`).
> Other formats may be added in the future if necessary, but current versions of
> the tool will ignore such files.
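
The current selection rules can be summarized in a few lines. The sketch below is an illustration, not the actual downloader: `archivesToDownload` and its `includeTests` option are hypothetical names.

```javascript
// Chooses which of the archives listed in a manifest should be downloaded.
function archivesToDownload(manifest, includeTests = false) {
  return manifest.archives.filter((archive) => {
    // Only `.tgz` archives are supported; other files are ignored.
    if (!archive.endsWith(".tgz")) return false;
    // The `tests` package is skipped unless explicitly requested.
    if (archive === "tests.tgz") return includeTests;
    // All other packages are downloaded by default.
    return true;
  });
}
```

If platform-specific archives are introduced as drafted above, an additional check on the `<os>` and `<arch>` parts of the name would slot into the same filter.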
### Example
For example, a library may have the following manifest:
```yaml
archives:
- main.tgz
- tests.tgz
```
With the following directory structure (nodes under archives represent what the
archive contains):
```
root/Foo/Bar/1.2.3
├── main.tgz
│   ├── src
│   │   ├── Main.enso
│   │   └── Foo.enso
│   ├── polyglot
│   │   └── java
│   │       └── native-helper.jar
│   ├── THIRD-PARTY
│   │   ├── native-component-license.txt
│   │   └── native-component-distribution-notice.txt
│   └── data
│       └── required-constants.csv
├── tests.tgz
│   ├── tests
│   │   └── MainSpec.enso
│   └── data
│       └── tests
│           └── big-test-data.csv
├── package.yaml
├── LICENSE.md
└── manifest.yaml
```
Then if both the `main.tgz` and `tests.tgz` packages are downloaded (normally
the tests are not downloaded, but there may be special settings that enable
this), the result is the following merged directory structure:
```
<downloaded-libraries-cache>/Foo/Bar/1.2.3
├── src
│   ├── Main.enso
│   └── Foo.enso
├── polyglot
│   └── java
│       └── native-helper.jar
├── THIRD-PARTY
│   ├── native-component-license.txt
│   └── native-component-distribution-notice.txt
├── tests
│   └── MainSpec.enso
├── data
│   ├── required-constants.csv
│   └── tests
│       └── big-test-data.csv
├── package.yaml
└── LICENSE.md
```
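A publisher may want to check before uploading that the sub-archives do not overlap. The following sketch merges the file lists of several archives and reports any path that occurs more than once; `mergeArchives` is a hypothetical helper, and the archive contents are assumed to be given as lists of relative file paths.

```javascript
// Merges the file lists of several archives and reports conflicting paths.
// `archives` maps an archive name to the relative paths of the files it
// contains; directories are implied by the paths and merge naturally.
function mergeArchives(archives) {
  const owners = new Map();
  for (const [archiveName, files] of Object.entries(archives)) {
    for (const file of files) {
      if (!owners.has(file)) owners.set(file, []);
      owners.get(file).push(archiveName);
    }
  }
  const merged = [...owners.keys()].sort();
  // A path provided by more than one archive is a conflict.
  const conflicts = merged.filter((file) => owners.get(file).length > 1);
  return { merged, conflicts };
}
```

In the example above, `data` appears in both archives, but only as a shared directory: the files inside it are distinct, so no conflict is reported.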
## Editions Repository
The Editions repository has a very simple structure.
Firstly, it must contain a `manifest.yaml` file at its root. The manifest
contains a single field `editions` which is a list of strings specifying the
editions that this provider provides.
For each entry in the manifest, there should be a file `<edition-name>.yaml` at
the root which corresponds to that entry.
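As an illustration, the edition files advertised by a repository can be located as follows. This is a sketch under assumptions: `editionFileUrls` is a hypothetical helper, `manifest` is the already-parsed `manifest.yaml`, and `repositoryRoot` is the base URL of the Editions repository.

```javascript
// Lists the URLs of the edition files advertised by an Editions repository.
function editionFileUrls(repositoryRoot, manifest) {
  const root = repositoryRoot.replace(/\/+$/, "");
  return manifest.editions.map((edition) => {
    // An unquoted name such as 2021.4 may have been parsed as a number.
    const name = String(edition);
    return `${root}/${encodeURIComponent(name)}.yaml`;
  });
}
```

Converting a numeric entry back to a string recovers names such as `2021.4`, but a YAML parser reads an unquoted `2021.10` as the same number as `2021.1`, so quoting edition names in the manifest is the safer choice.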
### Naming the Editions
The edition files are supposed to be immutable, so once published, an edition
should not be updated - instead a new edition should be created if changes are
necessary. In particular, once an edition with a particular name has been
downloaded, it is cached and will never be downloaded again (unless the user
manually deletes its file in the cache).
The edition names should be kept unique, because if multiple repositories
(listed in [the global configuration](./editions.md#updating-the-editions))
provide editions with the same name, the edition file from the first repository
on that list providing it will take precedence when the editions are being
updated, but once the editions are cached, modifying the list order will not
cause a re-download.
Each organization should try to make sure that their users will not encounter
edition name conflicts when using their custom edition repository. In
particular, it is recommended that custom published editions are prefixed with
the organization's name.
Official editions will use the following sets of names:
- the year and month format `<year>.<month>`, for example `2021.4`;
- `nightly-<year>-<month>-<day>` for nightly releases, for example
`nightly-2021-04-25`.
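An organization publishing custom editions could check that a proposed name does not resemble an official one. The patterns below are an assumption derived from the two formats listed above, not validation rules taken from the Enso tools, and `looksLikeOfficialEdition` is a hypothetical helper.

```javascript
// Patterns approximating the official naming schemes (assumed, not normative).
const officialRelease = /^\d{4}\.\d{1,2}$/;
const officialNightly = /^nightly-\d{4}-\d{2}-\d{2}$/;

// Returns true if the name follows one of the official naming schemes.
function looksLikeOfficialEdition(name) {
  return officialRelease.test(name) || officialNightly.test(name);
}
```

A name carrying an organization prefix, such as the hypothetical `acme-2021.4`, matches neither pattern and so cannot collide with an official edition.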
### Example Edition Provider Repository
For example, a manifest file with the following contents corresponds to the
directory structure shown below.
```yaml
editions:
- "2021.1"
- "foo"
- "bar"
```
```
root
├── manifest.yaml
├── 2021.1.yaml
├── foo.yaml
└── bar.yaml
```
## The Simple Library Server
We provide a simple webserver for hosting custom library and edition
repositories.
Currently it relies on Node.js, but that may change with future updates.
See
[`tools/simple-library-server/README.md`](../../tools/simple-library-server/README.md)
for more details.

View File

@ -0,0 +1,28 @@
package org.enso.editions
import io.circe.Decoder
/** A helper type to handle special parsing logic of edition names.
*
* The issue is that if an edition is called `2021.4` and it is written
* unquoted inside of a YAML file, that is treated as a floating point
* number, so special care must be taken to correctly parse it.
*/
case class EditionName(name: String) extends AnyVal
object EditionName {
/** A helper method for constructing an [[EditionName]]. */
def apply(name: String): EditionName = new EditionName(name)
/** A [[Decoder]] instance for [[EditionName]] that accepts not only strings
* but also numbers as valid edition names.
*/
implicit val editionNameDecoder: Decoder[EditionName] = { json =>
json
.as[String]
.orElse(json.as[Int].map(_.toString))
.orElse(json.as[Float].map(_.toString))
.map(EditionName(_))
}
}

View File

@ -1,6 +1,6 @@
package org.enso.editions
import cats.Show
import org.enso.yaml.ParseError
/** Indicates an error during resolution of a raw edition. */
sealed class EditionResolutionError(message: String, cause: Throwable = null)
@ -18,8 +18,11 @@ object EditionResolutionError {
)
/** Indicates that the edition cannot be parsed. */
case class EditionParseError(message: String, cause: Throwable)
extends EditionResolutionError(message, cause)
case class EditionParseError(cause: Throwable)
extends EditionResolutionError(
s"Cannot parse the edition: ${cause.getMessage}",
cause
)
/** Indicates that a library defined in an edition references a repository
* that is not defined in that edition or any of its parents, and so such a
@ -41,16 +44,6 @@ object EditionResolutionError {
s"Edition resolution encountered a cycle: ${editions.mkString(" -> ")}"
)
/** Wraps a Circe's decoding error into a more user-friendly error. */
def wrapDecodingError(decodingError: io.circe.Error): EditionParseError = {
val errorMessage =
implicitly[Show[io.circe.Error]].show(decodingError)
EditionParseError(
s"Could not parse the edition: $errorMessage",
decodingError
)
}
/** Wraps a general error thrown when loading or parsing an edition into a more
* specific error type.
*/
@ -59,7 +52,7 @@ object EditionResolutionError {
throwable: Throwable
): EditionResolutionError =
throwable match {
case decodingError: io.circe.Error => wrapDecodingError(decodingError)
case other => CannotLoadEdition(editionName, other)
case error: ParseError => EditionParseError(error)
case other => CannotLoadEdition(editionName, other)
}
}

View File

@ -6,6 +6,7 @@ import io.circe._
import nl.gn0s1s.bump.SemVer
import org.enso.editions.Editions.{Raw, Repository}
import org.enso.editions.SemVerJson._
import org.enso.yaml.YamlHelper
import java.io.FileReader
import java.net.URL
@ -17,11 +18,10 @@ object EditionSerialization {
/** Tries to parse an edition definition from a string in the YAML format. */
def parseYamlString(yamlString: String): Try[Raw.Edition] =
yaml.parser
.parse(yamlString)
.flatMap(_.as[Raw.Edition])
YamlHelper
.parseString[Raw.Edition](yamlString)
.left
.map(EditionResolutionError.wrapDecodingError)
.map(EditionResolutionError.EditionParseError)
.toTry
/** Tries to load an edition definition from a YAML file. */
@ -157,22 +157,6 @@ object EditionSerialization {
}
}
/** A helper opaque type to handle special parsing logic of edition names.
*
* The issue is that if an edition is called `2021.4` and it is written
* unquoted inside of a YAML file, that is treated as a floating point
* number, so special care must be taken to correctly parse it.
*/
private case class EditionName(name: String)
implicit private val editionNameDecoder: Decoder[EditionName] = { json =>
json
.as[String]
.orElse(json.as[Int].map(_.toString))
.orElse(json.as[Float].map(_.toString))
.map(EditionName)
}
implicit private val libraryDecoder: Decoder[Raw.Library] = { json =>
def makeLibrary(name: String, repository: String, version: Option[SemVer]) =
if (repository == Fields.localRepositoryName)

View File

@ -1,5 +1,7 @@
package org.enso.editions
import io.circe.{Decoder, DecodingFailure}
/** Represents a library name that should uniquely identify the library.
*
* The prefix is either a special prefix or a username.
@ -14,3 +16,27 @@ case class LibraryName(prefix: String, name: String) {
/** @inheritdoc */
override def toString: String = qualifiedName
}
object LibraryName {
/** A [[Decoder]] instance allowing to parse a [[LibraryName]]. */
implicit val decoder: Decoder[LibraryName] = { json =>
for {
str <- json.as[String]
name <- fromString(str).left.map { errorMessage =>
DecodingFailure(errorMessage, json.history)
}
} yield name
}
/** Creates a [[LibraryName]] from its string representation.
*
* Returns an error message on failure.
*/
def fromString(str: String): Either[String, LibraryName] = {
str.split('.') match {
case Array(prefix, name) => Right(LibraryName(prefix, name))
case _ => Left(s"`$str` is not a valid library name.")
}
}
}

View File

@ -0,0 +1,22 @@
package org.enso.editions.repository
import io.circe._
import org.enso.editions.EditionName
/** The Edition Repository manifest, which lists all editions that the
* repository provides.
*/
case class Manifest(editions: Seq[EditionName])
object Manifest {
object Fields {
val editions = "editions"
}
/** A [[Decoder]] instance for parsing [[Manifest]]. */
implicit val decoder: Decoder[Manifest] = { json =>
for {
editions <- json.get[Seq[EditionName]](Fields.editions)
} yield Manifest(editions)
}
}

View File

@ -0,0 +1,19 @@
package org.enso.yaml
import cats.Show
/** Indicates a parse failure, usually meaning that the input data has
* unexpected format (like missing fields or wrong field types).
*/
case class ParseError(message: String, cause: io.circe.Error)
extends RuntimeException(message, cause)
object ParseError {
/** Wraps a [[io.circe.Error]] into a more user-friendly [[ParseError]]. */
def apply(error: io.circe.Error): ParseError = {
val errorMessage =
implicitly[Show[io.circe.Error]].show(error)
ParseError(errorMessage, error)
}
}

View File

@ -0,0 +1,17 @@
package org.enso.yaml
import io.circe.{yaml, Decoder}
/** A helper for parsing YAML configs. */
object YamlHelper {
/** Parses a string representation of a YAML configuration of type `R`. */
def parseString[R](
yamlString: String
)(implicit decoder: Decoder[R]): Either[ParseError, R] =
yaml.parser
.parse(yamlString)
.flatMap(_.as[R])
.left
.map(ParseError(_))
}

View File

@ -0,0 +1,32 @@
package org.enso.editions
import org.enso.testkit.EitherValue
import org.enso.yaml.YamlHelper
import org.scalatest.EitherValues
import org.scalatest.matchers.should.Matchers
import org.scalatest.wordspec.AnyWordSpec
class LibraryNameSpec
extends AnyWordSpec
with Matchers
with EitherValue
with EitherValues {
"LibraryName" should {
"parse and serialize to the same thing" in {
val str = "Foo.Bar"
val libraryName = LibraryName.fromString(str).rightValue
libraryName.qualifiedName shouldEqual str
libraryName.name shouldEqual "Bar"
libraryName.prefix shouldEqual "Foo"
val yamlParsed = YamlHelper.parseString[LibraryName](str).rightValue
yamlParsed shouldEqual libraryName
}
"fail to parse if there are too many parts" in {
LibraryName.fromString("A.B.C") shouldEqual Left(
"`A.B.C` is not a valid library name."
)
}
}
}

View File

@ -0,0 +1,25 @@
package org.enso.editions.repository
import org.enso.editions.EditionName
import org.enso.testkit.EitherValue
import org.enso.yaml.YamlHelper
import org.scalatest.matchers.should.Matchers
import org.scalatest.wordspec.AnyWordSpec
class ManifestParserSpec extends AnyWordSpec with Matchers with EitherValue {
"Manifest" should {
"be parsed from YAML format" in {
val str =
"""editions:
|- foo
|- 2021.4
|- bar
|""".stripMargin
YamlHelper.parseString[Manifest](str).rightValue.editions shouldEqual Seq(
EditionName("foo"),
EditionName("2021.4"),
EditionName("bar")
)
}
}
}

View File

@ -0,0 +1,45 @@
package org.enso.librarymanager.published.repository
import io.circe.Decoder
import org.enso.editions.LibraryName
/** The manifest file containing metadata related to a published library.
*
* @param archives sequence of sub-archives that the library package is
* composed of
* @param dependencies sequence of direct dependencies of the library
* @param tagLine a short description of the library
* @param description a longer description of the library, for the Marketplace
*/
case class LibraryManifest(
archives: Seq[String],
dependencies: Seq[LibraryName],
tagLine: Option[String],
description: Option[String]
)
object LibraryManifest {
object Fields {
val archives = "archives"
val dependencies = "dependencies"
val tagLine = "tag-line"
val description = "description"
}
/** A [[Decoder]] instance for parsing [[LibraryManifest]]. */
implicit val decoder: Decoder[LibraryManifest] = { json =>
for {
archives <- json.get[Seq[String]](Fields.archives)
dependencies <- json.getOrElse[Seq[LibraryName]](Fields.dependencies)(
Seq()
)
tagLine <- json.get[Option[String]](Fields.tagLine)
description <- json.get[Option[String]](Fields.description)
} yield LibraryManifest(
archives = archives,
dependencies = dependencies,
tagLine = tagLine,
description = description
)
}
}

View File

@ -0,0 +1,53 @@
package org.enso.librarymanager.published.repository
import org.enso.editions.LibraryName
import org.enso.testkit.EitherValue
import org.enso.yaml.YamlHelper
import org.scalatest.matchers.should.Matchers
import org.scalatest.wordspec.AnyWordSpec
class LibraryManifestParserSpec
extends AnyWordSpec
with Matchers
with EitherValue {
"LibraryManifest" should {
"be parsed from YAML format" in {
val str =
"""archives:
|- main.tgz
|- tests.tgz
|dependencies:
|- Standard.Base
|- Foo.Bar
|tag-line: Foo Bar
|description: Foo bar baz.
|""".stripMargin
YamlHelper
.parseString[LibraryManifest](str)
.rightValue shouldEqual LibraryManifest(
archives = Seq("main.tgz", "tests.tgz"),
dependencies = Seq(
LibraryName.fromString("Standard.Base").rightValue,
LibraryName.fromString("Foo.Bar").rightValue
),
tagLine = Some("Foo Bar"),
description = Some("Foo bar baz.")
)
}
"require only a minimal set of fields to parse" in {
val str =
"""archives:
|- main.tgz
|""".stripMargin
YamlHelper
.parseString[LibraryManifest](str)
.rightValue shouldEqual LibraryManifest(
archives = Seq("main.tgz"),
dependencies = Seq(),
tagLine = None,
description = None
)
}
}
}

View File

@ -157,6 +157,8 @@ object HTTPDownload {
sink: Sink[ByteString, Future[A]],
resultMapping: (HttpResponse, A) => B
): TaskProgress[B] = {
// TODO [RW] Add optional stream encoding allowing for compression -
// add headers and decode the stream if necessary (#1805).
val taskProgress = new TaskProgressImplementation[B](ProgressUnit.Bytes)
val total = new java.util.concurrent.atomic.AtomicLong(0)
import actorSystem.dispatcher

View File

@ -0,0 +1,18 @@
package org.enso.runtimeversionmanager.http
import org.scalatest.matchers.should.Matchers
import org.scalatest.wordspec.AnyWordSpec
class HTTPDownloadSpec extends AnyWordSpec with Matchers {
"HTTPDownload" should {
"accept gzipped responses and decode them correctly" in {
// TODO [RW] Write the test using the simple-library-server (#1805)
/** It should:
* - generate a 2kb yaml file
* - run the server
* - download the file and verify its contents
* - if possible, check that it was indeed compressed
*/
}
}
}

View File

@ -0,0 +1,57 @@
# Simple Enso Library Server
A simple server for hosting custom Enso library repositories.
## Usage
You need Node.js to use this version of the server.
To install the dependencies, run `npm install` in the root directory of the
application. Then you can run the `main.js` file to start the server. See
`./main.js --help` for the available command-line options.
## Repository structure
When launching the server, you need to provide it with a directory that is the
root of the server. This directory should contain a `libraries` directory or
`editions` directory (or both). Each of them should have the directory structure
as described in [the Enso documentation](../../docs/libraries/repositories.md).
For example, the root directory may look like this:
```
root
├── libraries
│   ├── Foo
│   │   └── Bar
│   │       ├── 1.2.3
│   │       │   ├── meta
│   │       │   │   ├── preview.png
│   │       │   │   └── icon.png
│   │       │   ├── main.tgz
│   │       │   ├── tests.tgz
│   │       │   ├── LICENSE.md
│   │       │   ├── package.yaml
│   │       │   └── manifest.yaml
│   │       └── 1.2.4-SNAPSHOT.2021-06-24
│   │           └── ... # Truncated for clarity
│   └── Standard
│       ├── Base
│       │   └── 1.0.0
│       │       └── ...
│       └── Table
│           └── 1.0.0
│               └── ...
└── editions
    ├── manifest.yaml
    ├── 2021.1.yaml
    ├── foo.yaml
    └── bar.yaml
```
Then to add this repository as an edition provider you can append
`http://hostname:port/editions` to the `edition-providers` field in
`global-config.yaml`.
To use libraries from this repository, the editions should define the repository
with URL `http://hostname:port/libraries`.

View File

@ -0,0 +1,40 @@
#!/usr/bin/env node
const express = require("express");
const compression = require("compression");
const yargs = require("yargs");
const argv = yargs
.usage(
"$0",
"Hosts Enso libraries and editions from the local filesystem over HTTP."
)
.option("port", {
description: "The port to listen on.",
type: "number",
default: 8080,
})
.option("root", {
description:
"The root of the repository. It should contain a `libraries` or `editions` directory. See the documentation for more details.",
type: "string",
default: ".",
})
.help()
.alias("help", "h").argv;
console.log(
`Serving the repository located under ${argv.root} on port ${argv.port}.`
);
const app = express();
app.use(compression({ filter: shouldCompress }));
app.use(express.static(argv.root));
app.listen(argv.port);
function shouldCompress(req, res) {
if (req.path.endsWith(".yaml")) {
return true;
}
return compression.filter(req, res);
}

View File

@ -0,0 +1,21 @@
{
"name": "simple-library-server",
"version": "1.0.0",
"description": "A simple server for hosting Enso libraries and Editions.",
"main": "main.js",
"scripts": {
"test": "echo \"Error: no test specified\" && exit 1"
},
"keywords": [
"enso",
"libraries",
"server"
],
"author": "Enso Team",
"license": "Apache-2.0",
"dependencies": {
"compression": "^1.7.4",
"express": "^4.17.1",
"yargs": "^17.0.1"
}
}