The reason we keep track of dependents is for the `todo` calculation. When we make an edit, what are the things that need to be updated as a result? When adding term `a` that depends on "derived" term `b` or type `B`, a change to `b` or `B` affects `a`, so we record that `a` is a dependent of `b` and `B`. When adding type `A` that depends on type `B`, a change to `B` affects `A`, so we record that `A` is a dependent of `B`. We don't do anything for constructors, because constructors don't change. Depending on a constructor really means you depend on the type that constructor comes from (i.e. a constructor doesn't have dependents). Similarly, a constructor doesn't have dependencies, but its declaring type may depend on other types.
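A minimal sketch of the bookkeeping described above, assuming a plain `Map` from each dependency to its set of dependents. The names `Dependency`, `Dependents`, `target`, `addDefinition` and `dependentsOf` are made up for illustration, and `Reference` is a stand-in for the real type.

```haskell
import qualified Data.Map.Strict as Map
import           Data.Map.Strict (Map)
import qualified Data.Set as Set
import           Data.Set (Set)

-- Hypothetical stand-in for the real reference type.
newtype Reference = Reference String deriving (Eq, Ord, Show)

-- What a definition can mention: a derived term, a type, or a constructor
-- (identified here by its declaring type).
data Dependency
  = TermDep Reference
  | TypeDep Reference
  | ConstructorOf Reference
  deriving (Eq, Ord, Show)

-- dependency -> definitions to revisit when that dependency changes
type Dependents = Map Reference (Set Reference)

-- A constructor never gets its own entry; it resolves to its declaring type.
target :: Dependency -> Reference
target (TermDep r)       = r
target (TypeDep r)       = r
target (ConstructorOf t) = t

-- Record that `new` is a dependent of everything it mentions.
addDefinition :: Reference -> [Dependency] -> Dependents -> Dependents
addDefinition new deps index =
  foldr (\d -> Map.insertWith Set.union (target d) (Set.singleton new)) index deps

-- Everything that needs attention when `r` changes (one level only).
dependentsOf :: Reference -> Dependents -> Set Reference
dependentsOf = Map.findWithDefault Set.empty
```

A `todo` calculation would presumably follow `dependentsOf` transitively; that part is not sketched here.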
Commands
/> cd libs/Foo
/libs/Foo> cd ..
/libs> fork Foo Foo2
/libs> fork <someurl> thing
/libs> fork Foo /outside/Foo
/libs> fork /outside/Foo /outside/Foo2
/libs> help merge
`> merge src dest`
/libs> merge /outside/Foo Foo
/libs> merge Foo2 Foo
/libs/Foo> <work work work>
/libs> move /libs/Foo /libs/Foo'
/libs>
A.B.c
A.B.d
arya renames, and has: ->
A.Z.c
A.Z.d
paul adds, and has ->
A.B.e
A.B.c
A.B.d
then merge ->
"Merge introduces the following aliases:"
A.Z.c -> A.B.c
A.Z.d -> A.B.d
/libs> delete /libs/Foo
"warning: /libs/Foo includes the following definitions that aren't anywhere else:
A.B.e#123
run it again to proceed with deletion"
/libs> alias /libs/Foo/sqrt /libs/Foo2/butt
-- we talked about combining alias & fork into a single "copy" command
/libs>
Weird thing: There's no history for `sqrt`!
Suppose:
data Raw = Raw
  { _termsR :: Set Referent
  , _typesR :: Set Reference
  , _childrenR :: Map NameSegment Hash
  }
/libfoo/Foo <- type
/libfoo/Foo <- constructor
/libfoo/Foo.f <- term in child namespace
/libfoo> move Foo Foo2
/libfoo> alias Foo Foo2
Data types:
Old PrettyPrintEnv is for pretty-printing code, and ___
{ terms :: Referent -> Maybe HashQualified
, types :: Reference -> Maybe HashQualified }
Q: How do we want to handle lookup of names that are outside of our branch?
Old Namespace
{ _terms :: Relation Name Referent
, _types :: Relation Name Reference }
Old Names is an unconflicted Namespace. Is it for parsing code? It is not sufficient to parse hash-qualified names.
{ termNames :: Map Name Referent
, typeNames :: Map Name Reference }
New Names combines old PrettyPrintEnv and old Names:
-- these HashQualified are fully qualified
{ terms :: Relation HashQualified Referent
, types :: Relation HashQualified Reference }
We should be able to construct one from a `Codebase2`, given:
root :: Branch
current :: Branch
terms :: Set HashQualified
types :: Set HashQualified
or
root :: Branch
current :: Branch
terms :: Set Referent
types :: Set Reference
Needed functionality
Parsing a .u file:
- Look up a Reference by name
- Look up a Reference by hash-qualified name? We could avoid this by requiring that the user deconflict the names before parsing.
Parsing command-line arguments:
- Look up a Reference by name.
- Look up a Reference by hash-qualified name (possibly from among deleted names); for resolving conflicted names and edits.
/foo> todo
These names are conflicted:
  foo#abc
  foo#xyz
Use `rename` to change a name, or `unname` to remove one.
These edits are conflicted:
  bar#fff -> bar#ggg : Nat (12 usages)
  bar#fff -> bar#hhh : Nat -> Nat (7 usages)
  bar#fff (Deprecated)
Use `view bar#ggg bar#hhh` to view these choices.
Use `edit.resolve` to choose a canonical replacement.
Use `edit.unreplace` to cancel a replacement.
Use `edit.undeprecate` to cancel a deprecation.
Use `edit.replace bar#hhh bar#ggg` to start replacing the 7 usages of `bar#hhh` with `bar#ggg`.
/foo> alias bar baz
Not sure which bar you meant?
  bar#ggg
  bar#hhh
Try specifying the hash-qualified name, or sort out the conflicts before making the alias.
/foo> edit.resolve bar#fff bar#ggg
  Cleared bar#fff -> bar#hhh
  Added bar#ggg -> bar#hhh
or
/foo> edit.unreplace bar#fff bar#ggg
  Cleared bar#fff -> bar#ggg
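To make "conflicted edit" concrete, here is a rough sketch that treats a patch as a relation from an old hash to its candidate replacements. `Patch`, `fromPairs`, `conflicts` and `resolveEdit` are hypothetical names, and `resolveEdit` only drops the losing replacements; it does not model the extra edit that the `edit.resolve` transcript reports as added.

```haskell
import qualified Data.Map.Strict as Map
import           Data.Map.Strict (Map)
import qualified Data.Set as Set
import           Data.Set (Set)

type Hash = String

-- A patch as a relation: one old hash may have several replacements.
type Patch = Map Hash (Set Hash)

fromPairs :: [(Hash, Hash)] -> Patch
fromPairs = Map.fromListWith Set.union . map (fmap Set.singleton)

-- Edits with more than one candidate replacement are conflicted.
conflicts :: Patch -> Patch
conflicts = Map.filter ((> 1) . Set.size)

-- Keep only the chosen replacement for `old` (assumed reading of resolve).
resolveEdit :: Hash -> Hash -> Patch -> Patch
resolveEdit old chosen = Map.insert old (Set.singleton chosen)

-- The conflicted edit from the transcript above.
example :: Patch
example = fromPairs [("bar#fff", "bar#ggg"), ("bar#fff", "bar#hhh")]
```

Here `conflicts example` still contains `bar#fff`, while `conflicts (resolveEdit "bar#fff" "bar#ggg" example)` is empty.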
Pretty-printing:
- Select a name by Reference
Q: What to do about names outside the current branch?
Option 1: Don't support names outside the current branch; user must go up a level (possibly to the root), set up the names as desired, and then descend again.
Option 2: Introduce some syntax for names outside the current branch, e.g. `_root_.Foo.bar`. We could first lookup references in the current branch, then in the root branch, then in the history of the root branch?
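A sketch of the Option 2 lookup order, under the simplifying assumption that each branch's names fit in a `Map` (the notes use a `Relation`, so conflicts are ignored here). `resolve` is a hypothetical helper, and the "history of the root branch" step is left out.

```haskell
import           Control.Applicative ((<|>))
import           Data.List (stripPrefix)
import qualified Data.Map.Strict as Map
import           Data.Map.Strict (Map)

type Name = String
newtype Reference = Reference String deriving (Eq, Show)

-- Simplified: one reference per name.
type Names = Map Name Reference

-- `_root_.`-prefixed names go straight to the root branch; anything else
-- is tried in the current branch first, then in the root branch.
resolve :: Names -> Names -> Name -> Maybe Reference
resolve current root n =
  case stripPrefix "_root_." n of
    Just absolute -> Map.lookup absolute root
    Nothing       -> Map.lookup n current <|> Map.lookup n root
```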
TODO tracking refactoring of existing functionality
- Add edits/patches to Namespace / Branch
- Add patch to `NameTarget`
- rename `propagate` to `patch`
  - moves names from old hash to new hash, transitively, to the type-preserving frontier
- `list [path]`: by default, don't descend into links with names that start with `_`
- `todo <edit> [path]`
  - list conflicted names (hash-qualified) and edit frontier
- `update <edit> [path]`: when updating a term, old names go into `./_archived`, which will be largely conflicted.
- `propagate <edit> [path]`
- `edit.resolve <patch> <hq> <hq>`
Old names use case 1:
patch:
#a -> #b
#a -> #c
namelookup:
#b -> "foo"
#c -> "foo2"
"You have a conflicted edit:
#a -> foo#b
#a -> foo2#c
Please choose one.
"
/pc/libs/x> edit.resolve #a foo#b
You're in the middle of an edit, it's not type preserving
- rename / move
  - `rename.edits`
  - `rename.type`
  - `rename.term`
- name / copy
  - `copy <[src][#hash]> <newname>`
- `todo <edit> [path]`, `update <edit> [path]`, `propagate <edit> [path]`
- Implement `Branch.sync` operation that synchronizes a monadic `Branch` to disk
- Implement something like `Branch.fromDirectory : FilePath -> IO (Branch IO)` for getting a lazy proxy for a `Branch`
  - Also `Branch.fromExternal : (Path -> m ByteString) -> Hash -> m (Branch m)`
  - Could we create a `Branch` from a GitHub reference? Seems like yeah, it's just going to do some HTTP fetching.
- Tweak `Codebase` to `Codebase2`
- Implement a `Codebase2` for `FileCodebase2`
- Implement `Actions2`
- Implement `Editor2`
- Implement `OutputMessages2`
- Implement `InputPatterns2`
- Go back and leave a spot for Link in serialized Branch0 format.
- Split Edits out of `Branch0`
- Delete old `Namespace`, and instead add deprecated names
- Parsing takes a `Names`, a map from `Name` (fully-qualified name) to `Referent`/`Reference`. We should switch these from `Map` to `Name -> Optional xxx`, or even `Name -> m (Optional xxx)`
- `Context.synthesizeClosed` takes a `TypeLookup`, which includes a map from `Reference` to `Type`, `DataDecl`, `EffectDecl`. Shall we plan to include the full codebase here, or load them on demand? Maybe it doesn't matter yet. `parseAndSynthesizeFile` takes a `Set Reference -> m (TypeLookup v Ann)`, maybe that's a good model.
- `add` and `update` will need a way to update the `Branch'` at the current level, and all the way back to the root. Some kind of zipper?
- `find` takes an optional path
- `fork` takes a `RepoPath` (or we could have a dedicated command like `clone`)
- `merge` takes at least a path, if not a `RepoPath`
- `publish` or `push` that takes a local path and a remote path?
Branchless codebase format
Commands / Usage
/> clone gh:aryairani/libfoo
Copied gh:aryairani/libfoo blah blah to /libfoo
/> undo
/> clone gh:aryairani/libfoo /libs/DeepLearning/Foo
Copied gh:aryairani/libfoo blah blah to /libs/DeepLearning/Foo
/>
clone <remote> [path]
push [path] <remote>
/> cd projects
/projects> rename FaceDetector FaceDetector/V1
/projects> cd FaceDetector
/projects/FaceDetector> cp V1 V2
cd <path> (support relative paths?)
cp <path> <path>
/projects/FaceDetector> replace.scoped V2 /libs/DeepLearning/Foo/thing1 mything1
Noted replacement of thing1#af2 with mything1#i9d within /projects/FaceDetector/V2.
replace.write <editsetid> <ref1> <ref2>
todo <editsetid> <path>
/projects/FaceDetector> todo
...7 things...
/projects/FaceDetector> todo /
...33 things...
/projects/FaceDetector>
`mv` / `rename` command: can refer to Terms, Types, Directories, or all three. Use hash-qualified names to discriminate.
Namespaces
data Branch' m = Branch' (Causal m (Namespace m))
data Causal m e
  = One { currentHash :: Hash, head :: e }
  | Cons { currentHash :: Hash, head :: e, tail :: m (Causal m e) }
  -- The merge operation `<>` flattens and normalizes for order
  | Merge { currentHash :: Hash, head :: e, tails :: Map Hash (m (Causal m e)) }
-- just one level of name, like Foo or Bar, but not Foo.Bar
newtype NameSegment = NameSegment { toText :: Text } -- no dots, no slashes
newtype Path = Path { toList :: [NameSegment] }
data Namespace m = Namespace
  { terms :: Relation NameSegment Referent
  , types :: Relation NameSegment Reference
  , children :: Relation NameSegment (Branch' m)
  }
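To show how a `Path` walks this structure, here is a deliberately simplified, pure sketch: no `Causal` history, no monad, and `children` as a `Map` instead of a `Relation` (so conflicted children cannot be expressed). `getAt`, `modifyAt` and `emptyNamespace` are hypothetical helpers. `modifyAt` rebuilds every parent on the way back to the root, which is one plain alternative to a zipper.

```haskell
import           Control.Monad (foldM)
import qualified Data.Map.Strict as Map
import           Data.Map.Strict (Map)

newtype NameSegment = NameSegment String deriving (Eq, Ord, Show)
newtype Path = Path [NameSegment]

-- Simplified namespace: term hashes as Strings, children as a Map.
data Namespace = Namespace
  { terms    :: Map NameSegment String
  , children :: Map NameSegment Namespace
  } deriving (Eq, Show)

emptyNamespace :: Namespace
emptyNamespace = Namespace Map.empty Map.empty

-- Descend one segment at a time; Nothing if any segment is missing.
getAt :: Path -> Namespace -> Maybe Namespace
getAt (Path segs) root = foldM (\ns seg -> Map.lookup seg (children ns)) root segs

-- Apply `f` at the end of the path, creating empty children as needed,
-- and rebuild each parent so the change is visible from the root.
modifyAt :: Path -> (Namespace -> Namespace) -> Namespace -> Namespace
modifyAt (Path [])           f ns = f ns
modifyAt (Path (seg : rest)) f ns =
  ns { children = Map.insert seg (modifyAt (Path rest) f child) (children ns) }
  where child = Map.findWithDefault emptyNamespace seg (children ns)
```

In the real structure each rebuilt parent would presumably also get a new `Causal` node and hash; that is omitted here.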
Repo format:
# types
.unison/types/<hash>/compiled.ub
.unison/types/<hash>/dependents/<hash>
.unison/types/_builtin/<base58>/dependents/<hash>
# terms
.unison/terms/<hash>/compiled.ub
.unison/terms/<hash>/type.ub
.unison/terms/<hash>/dependents/<hash>
.unison/terms/_builtin/<base58>/dependents/<hash>
# branches (hashes of Causal m Namespace)
.unison/branches/<hash>.ubf
.unison/branches/head/<hash> -- if several, merge to produce new head.
Backup Names?
For pretty-printing, we want a name for every hash. Even for hashes we deleted the names for. 😐
- When we delete a name `x` from path `/p` (i.e. `/p/x`), we add the name `/_deleted/p/x`.
- Or, do we just disallow removing the last name of things with dependencies?
- When deleting a name, notify the user of the remaining names.
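A small sketch of the first and third bullets, assuming names are a simple `Map` from a path to a hash. `deleteName` and `liveNamesOf` are hypothetical helpers, not existing functions.

```haskell
import           Data.List (isPrefixOf)
import qualified Data.Map.Strict as Map
import           Data.Map.Strict (Map)

type Name  = [String]        -- e.g. ["p", "x"] for /p/x
type Hash  = String
type Names = Map Name Hash   -- simplified: one hash per name

-- Deleting /p/x keeps a backup name at /_deleted/p/x.
deleteName :: Name -> Names -> Names
deleteName name names = case Map.lookup name names of
  Nothing -> names
  Just h  -> Map.insert ("_deleted" : name) h (Map.delete name names)

-- Names still available for a hash, ignoring backups; this is what the
-- user could be shown after a deletion.
liveNamesOf :: Hash -> Names -> [Name]
liveNamesOf h names =
  [ n | (n, h') <- Map.toList names, h' == h, not (["_deleted"] `isPrefixOf` n) ]
```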
Edits
newtype EditMap m = EditMap { toMap :: Map GUID (Causal m Edits) }
data Edits = Edits
  { terms :: Relation Reference TermEdit
  , types :: Relation Reference TypeEdit
  }
type FriendlyEditNames = Relation Text GUID
Repo format:
.unison/edits/<guid>/<hash>
.unison/edits/<guid>/name/<base58> -- (base58encode (utf8encode "name of the edit"))
.unison/edits/<guid>/head/<hash> -- if several, merge to produce new head.
TODO: How to share these edits?
- It could be the same as sharing Unison names (e.g. if the edits were Unison terms)
- It could be the same as sharing Unison definitions:
Make up a URI that references a repo and an edit GUID.
e.g.
https://github.com/<user>/<repo>/<...>/<guid>[/hash]
`clone.edits <remote-url> [local-name]`
- `guid` comes from `remote-url`, and is locally given the name `local-name`
- if `local-name` is omitted, then copy name from `remote-url`.
- if `local-name` already exists locally with a different `guid`, then abort.
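The three `clone.edits` rules can be sketched as a pure function over the local name table. This assumes the remote URL has already been resolved to a GUID and a name; `Remote` and `cloneEdits` are hypothetical, and `FriendlyEditNames` is simplified from a `Relation` to a `Map`.

```haskell
import qualified Data.Map.Strict as Map
import           Data.Map.Strict (Map)
import           Data.Maybe (fromMaybe)

type GUID = String

-- Simplified from `Relation Text GUID`: one GUID per local name.
type FriendlyEditNames = Map String GUID

-- What we assume can be learned from the remote URL.
data Remote = Remote { remoteGuid :: GUID, remoteName :: String }

cloneEdits :: Remote -> Maybe String -> FriendlyEditNames
           -> Either String FriendlyEditNames
cloneEdits remote localName names =
  case Map.lookup name names of
    -- Same name already bound to a different edit set: abort.
    Just existing | existing /= remoteGuid remote ->
      Left ("abort: " ++ name ++ " already refers to a different edit set")
    -- Unbound, or already bound to this GUID: record the binding.
    _ -> Right (Map.insert name (remoteGuid remote) names)
  where
    -- If no local name is given, copy the remote one.
    name = fromMaybe (remoteName remote) localName
```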
Editsets as first-class unison terms:
Benefits:
- Don't have two separate dimensions of forking and causality (namespace vs edits).
- Makes codebase model way simpler to explain. <— BFD
Costs / todo:
Q: Do we allow users to edit `EditSets` using standard `view` and `edit` in M1?
If Yes:
- EditSets are arbitrary Unison programs that need to be evaluated. Once evaluated, they would have a known structure that can be decomposed for EditSet operations. We would need:
  - some new or existing syntax for constructing EditSet values
  - a way to evaluate these unison programs
  - a way to save evaluated results back to the codebase / namespace
    - Q: Do we evaluate and save these eagerly or lazily?
  - a way in Haskell to deconstruct the EditSet value
  - a way to modify (append to) values of that type using CLI commands. e.g. `update`?
    - either `update` calls a unison function that
If no (we don't provide user syntax for constructing `EditSets` in .u file):
- EditSets are part of the term language?
- Or a constructor with a particular hash? (Applied to Unison terms)
Collecting external dependencies
If a subtree references external dependencies, they should be given local names when exporting.
Given:
/A/B/c#xxx
/D/E/f#yyy (depends on #xxx, #zzz)
/D/G/h#zzz
/libs/G/bar#zzz
If `/D/E` is published, what names should be assigned to `#xxx`, `#zzz`?
Idea 1: Names relative to nearest parent
Collect external dependencies under `Dependencies`, using names relative to the nearest parent in common with the publication point?
i.e.:
f#yyy
Dependencies/A/B/c#xxx
Dependencies/G/h#zzz
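A sketch of how Idea 1 could compute those names: drop the longest prefix shared by the publication point and the dependency's path, then put the remainder under `Dependencies`. The function names are hypothetical.

```haskell
type Path = [String]   -- /D/G/h is ["D", "G", "h"]

-- Drop the longest prefix shared with the publication point.
relativeToCommonParent :: Path -> Path -> Path
relativeToCommonParent (p : ps) (d : ds)
  | p == d = relativeToCommonParent ps ds
relativeToCommonParent _ dep = dep

-- Name given to an external dependency inside the published subtree.
dependencyName :: Path -> Path -> Path
dependencyName publicationPoint dep =
  "Dependencies" : relativeToCommonParent publicationPoint dep
```

With `/D/E` as the publication point, `/D/G/h` shares the parent `/D` and becomes `Dependencies/G/h`, while `/A/B/c` only shares the root and becomes `Dependencies/A/B/c`.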
Idea 2: Somehow derive from qualified imports used?
If
Idea 3: Surface the condition* to the user
*the condition = the publication node contains definitions that reference definitions not under the publication node.
Ask them to create aliases below the publication point?
Idea 4: Add external names to ./_auxNames/
The nearest aux-name would be used to render code only if there were no primary names known.
Idea 5: Something with symlinks
data Branch' m = Branch' (Causal m (Namespace m))
data Causal m e
  = One { currentHash :: Hash, head :: e }
  | Cons { currentHash :: Hash, head :: e, tail :: m (Causal m e) }
  -- The merge operation `<>` flattens and normalizes for order
  | Merge { currentHash :: Hash, head :: e, tails :: Map Hash (m (Causal m e)) }
-- just one level of name, like Foo or Bar, but not Foo.Bar
newtype NameSegment = NameSegment { toText :: Text } -- no dots, no slashes
newtype Path = Path { toList :: [NameSegment] }
data Namespace m = Namespace
  { terms :: Relation NameSegment Referent
  , types :: Relation NameSegment Reference
  , children :: Relation NameSegment (Link m)
  }
data Link m = LocalLink (Branch' m) | RemoteLink RemotePath
data RemotePath = Github { username :: Text, repo :: Text, commit :: Text } -- | ... future
This lets us avoid redistributing libs unnecessarily — let the requesting user get it from wherever we got it from. But it doesn't specifically address this external naming question.
We might be publishing `/app/foo` which references definitions we got from `repo1`. Somewhere in our tree (possibly under `/app/foo` and possibly not?) we have a link to `repo1`.
Somewhere under `/app/foo` we reference some defn from `repo1`.
Transitive publication algorithm:
- find all the things that you're referencing
- the things you're publishing that aren't under the publication point need to be resolved
  - they're local, and need to be given names under the publication point
    - user is notified, or we do something automatic
  - they're remote, and we need to include, in the publication, a link to the remote repo.
    - user is notified, or we do something automatic
- "Something automatic" will be:
  - mirror the dependency names from our namespace into `./_Libs`; if it would produce naming conflicts to use `./_Libs`, then `_Libs1`, etc.
  - Or, just dump them into `./_Libs` and if doing so produces naming conflicts, force the user to resolve them before publishing.
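For the `_Libs`, `_Libs1`, ... fallback, a minimal sketch. It assumes "would produce naming conflicts" can be approximated by "a child with that name already exists", which is a simplification of whatever conflict check is really intended. `freshLibsName` is a hypothetical helper.

```haskell
import qualified Data.Set as Set
import           Data.Set (Set)

-- First of _Libs, _Libs1, _Libs2, ... that is not already a child name
-- at the publication point.
freshLibsName :: Set String -> String
freshLibsName taken = go (0 :: Int)
  where
    go n
      | name `Set.member` taken = go (n + 1)
      | otherwise               = name
      where
        name = if n == 0 then "_Libs" else "_Libs" ++ show n
```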
Syncing with remote codetrees
-- names tbd
data BranchPath = BranchPath RepoRef Path
data RepoRef = Local | GithubRef { username :: Text, repo :: Text, treeish :: Text }
/libs/community/DL
becomes
```haskell
BranchPath Local (Path ["libs","community","DL"])
```
gh:<user>/<repo>[/<path>][?ref=<treeish>] -- defaults to repo's default_branch
e.g. gh:aryairani/unison/libs?ref=topic/370
becomes
```haskell
BranchPath (GithubRef "aryairani" "unison" "topic/370") (Path ["libs"])
```
or
gh:user/repo[:treeish][/path]
e.g. gh:aryairani/unison:topic/370/libs
becomes
```haskell
BranchPath (GithubRef "aryairani" "unison" "topic/370") (Path ["libs"])
```
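A sketch of parsing the `?ref=` form, which is unambiguous because the treeish is not mixed into the path. `GhUri`, `parseGh` and `splitOn` are hypothetical; the result keeps the ref optional so a caller can fall back to the repo's default branch when it is absent.

```haskell
import Data.List (stripPrefix)

-- Parsed pieces of a gh: URI (hypothetical record, not from the codebase).
data GhUri = GhUri
  { ghUser :: String
  , ghRepo :: String
  , ghPath :: [String]
  , ghRef  :: Maybe String   -- Nothing means: use the default branch
  } deriving (Eq, Show)

-- Split a string on a separator character.
splitOn :: Char -> String -> [String]
splitOn c s = case break (== c) s of
  (chunk, [])       -> [chunk]
  (chunk, _ : rest) -> chunk : splitOn c rest

-- Parse gh:user/repo[/path][?ref=treeish]; Nothing on malformed input.
parseGh :: String -> Maybe GhUri
parseGh input = do
  body <- stripPrefix "gh:" input
  let (location, query) = break (== '?') body
  ref <- case query of
    "" -> Just Nothing
    _  -> Just <$> stripPrefix "?ref=" query
  case filter (not . null) (splitOn '/' location) of
    user : repoName : path -> Just (GhUri user repoName path ref)
    _                      -> Nothing
```

For the example above, `parseGh "gh:aryairani/unison/libs?ref=topic/370"` yields user `aryairani`, repo `unison`, path `["libs"]` and ref `topic/370`.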
Github Notes
Github uses a few different URL schemes. They call the ones you can pluck off their website "html_url"s. They let you refer to files and directories, and can be parameterized by git treeish (branch, tag, commit).
We can interpret these to refer to the root of a namespace. https://github.com/unisonweb/unison can be interpreted as:
GithubRef "unisonweb" "unison" <$> getDefaultBranch "unisonweb" "unison"
The Github website will let you navigate to a git branch, e.g https://github.com/unisonweb/unison/tree/topic/370/ can be interpreted as:
GithubRef "unisonweb" "unison" <$> matchBranch "unisonweb" "unison" "topic/370/"
Branch names can contain slashes, such as `topic/370`, complicating parsing if there's meant to be path info following the branch name.
- Fortunately, if you have a git branch `a/b` then it's not possible to create branches `a` or `a/b/c`. So you can load the list of branches from JSON, and then test them against that treeish-prefixed path without ambiguity.
- Github's website doesn't know how to navigate into `Causal` structures, so it's never going to give us URLs with paths into a Unison namespace. So maybe this is a moot point.
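The prefix test from the first bullet can be sketched as a pure function, assuming the branch list has already been fetched and decoded. `splitTreeish` and `splitOn` are hypothetical helpers; this is not the `matchBranch` used earlier, which also takes the user and repo.

```haskell
import Data.List (isPrefixOf)

-- Split a string on a separator character.
splitOn :: Char -> String -> [String]
splitOn c s = case break (== c) s of
  (chunk, [])       -> [chunk]
  (chunk, _ : rest) -> chunk : splitOn c rest

-- Given the known branch names and the URL segments after /tree/,
-- return the matching branch and the remaining path segments.
-- Since git forbids both `a` and `a/b` as branches, at most one
-- branch should match; anything else is treated as a failure.
splitTreeish :: [String] -> [String] -> Maybe (String, [String])
splitTreeish branches segments =
  case [ (b, drop (length parts) segments)
       | b <- branches
       , let parts = splitOn '/' b
       , parts `isPrefixOf` segments ] of
    [match] -> Just match
    _       -> Nothing
```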
So, I would still go ahead with the made-up `gh:username/repo[:treeish][/path]` URI scheme; we can try to support the other URLs mentioned above, and let them refer to the root of the published namespace.
Our Javascript viewer can be made to create URLs with query params or fragments in them that can indicate the Unison path, and those can be the ones we share in tweets, etc:
http(s)://.github.io/?branch=&path= with the default branch being the head, and the default path being `/`.