util: add a function to get absolute path without resolving symlinks

Summary:
On Windows, some people use the "map drive" feature to map a long path (e.g.
`C:\long\path\to\repo`) to a short path (e.g. `Z:\`) so that their tooling can
handle paths that would otherwise be too long.

In that case, it is undesirable for `hg root` to resolve symlinks.

Unfortunately, the Rust stdlib has no equivalent of Python's `os.path.abspath`.
There have been some attempts (e.g. https://github.com/rust-lang/rust/pull/47363),
but corner cases (e.g. symlinks) have made the problem much more
complicated.

There are some third-party crates, but they are not a good fit:
- https://github.com/danreeves/path-clean/ (last commit fb84930) follows the golang plan9 idea. It does not have proper support for Windows paths.
- https://github.com/vitiral/path_abs/ (latest commit 8370838) reinvents many path-related types, which is overkill for this use case.

This diff implements the feature "reasonably" for both Windows and Linux,
ignoring nasty corner cases (e.g. symlinks).
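
To illustrate what "reasonably" means here, below is a minimal sketch of the same lexical normalization. It is not the code in this diff: `absolute_from` is a hypothetical helper that takes the base directory as a parameter instead of reading `std::env::current_dir()`, so the results are deterministic. The example paths assume a Unix-like system.

```rust
use std::path::{Component, Path, PathBuf};

/// Hypothetical helper (not part of this diff): join `path` onto `base`,
/// then resolve `.` and `..` purely lexically, without touching the
/// filesystem. Returns `None` if the joined path is still not absolute.
fn absolute_from(base: &Path, path: &Path) -> Option<PathBuf> {
    let joined = if path.is_absolute() {
        path.to_path_buf()
    } else {
        base.join(path)
    };
    if !joined.is_absolute() {
        return None;
    }
    let mut result = PathBuf::new();
    for component in joined.components() {
        match component {
            // `pop` does nothing at the root, so excess `..` are dropped.
            Component::ParentDir => {
                result.pop();
            }
            Component::CurDir => {}
            other => result.push(other),
        }
    }
    Some(result)
}
```

With a base of `/repo`, the input `foo/bar/..` becomes `/repo/foo` even if `foo/bar` is a symlink on disk, and `../../../x` becomes `/x` because `..` cannot climb above the root.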

Differential Revision: D16952485

fbshipit-source-id: ba91f4975c2e018362e2530119765a380f103e19
Jun Wu 2019-08-21 17:58:21 -07:00 committed by Facebook Github Bot
parent 1387b3f3bf
commit 3e73af0a9a

@@ -5,7 +5,10 @@
//! Path-related utilities.
/// Normalize a normalized Path for display.
use std::io;
use std::path::{Component, Path, PathBuf};
/// Normalize a canonicalized Path for display.
///
/// This removes the UNC prefix `\\?\` on Windows.
pub fn normalize_for_display(path: &str) -> &str {
@@ -24,3 +27,96 @@ pub fn normalize_for_display_bytes(path: &[u8]) -> &[u8] {
path
}
}
/// Return the absolute and normalized path without accessing the filesystem.
///
/// Unlike [`std::fs::canonicalize`], this does not follow symlinks.
///
/// This function does not access the filesystem. Therefore it can behave
/// differently from the kernel or other library functions in corner cases.
/// For example:
///
/// - On some systems with symlink support, `foo/bar/..` and `foo` can be
/// different as seen by the kernel, if `foo/bar` is a symlink. This
/// function always returns `foo` in this case.
/// - On Windows, the official normalization rules are much more complicated.
/// See https://github.com/rust-lang/rust/pull/47363#issuecomment-357069527.
/// For example, this function cannot translate "drive relative" paths like
/// "X:foo" to an absolute path.
///
/// Return an error if `std::env::current_dir()` fails or if this function
/// fails to produce an absolute path.
pub fn absolute(path: impl AsRef<Path>) -> io::Result<PathBuf> {
    let path = path.as_ref();
    // Relative paths are resolved against the current directory.
    let path = if path.is_absolute() {
        path.to_path_buf()
    } else {
        std::env::current_dir()?.join(path)
    };
    // The joined path can still be non-absolute (e.g. "X:foo" on Windows).
    if !path.is_absolute() {
        return Err(io::Error::new(
            io::ErrorKind::Other,
            format!("cannot get absolute path from {:?}", path),
        ));
    }
    let mut result = PathBuf::new();
    for component in path.components() {
        match component {
            Component::Normal(_) | Component::RootDir | Component::Prefix(_) => {
                result.push(component);
            }
            // `pop` does nothing at the root, so excess `..` are dropped.
            Component::ParentDir => {
                result.pop();
            }
            Component::CurDir => (),
        }
    }
    Ok(result)
}
#[cfg(test)]
mod tests {
    use super::*;
    #[cfg(windows)]
    mod windows {
        use super::*;
        #[test]
        fn test_absolute_fullpath() {
            assert_eq!(absolute("C:/foo").unwrap(), Path::new("C:\\foo"));
            assert_eq!(
                absolute("x:\\a/b\\./.\\c").unwrap(),
                Path::new("x:\\a\\b\\c")
            );
            assert_eq!(
                absolute("y:/a/b\\../..\\c\\../d\\./.").unwrap(),
                Path::new("y:\\d")
            );
            assert_eq!(
                absolute("z:/a/b\\../..\\../..\\..").unwrap(),
                Path::new("z:\\")
            );
        }
    }
    #[cfg(unix)]
    mod unix {
        use super::*;
        #[test]
        fn test_absolute_fullpath() {
            assert_eq!(absolute("/a/./b\\c/../d/.").unwrap(), Path::new("/a/d"));
            assert_eq!(absolute("/a/../../../../b").unwrap(), Path::new("/b"));
            assert_eq!(absolute("/../../..").unwrap(), Path::new("/"));
            assert_eq!(absolute("/../../../").unwrap(), Path::new("/"));
            assert_eq!(
                absolute("//foo///bar//baz").unwrap(),
                Path::new("/foo/bar/baz")
            );
            assert_eq!(absolute("//").unwrap(), Path::new("/"));
        }
    }
}
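
The symlink caveat from the doc comment of `absolute` can be observed on a real filesystem. The following is a rough sketch, not part of this diff: `lexical` and `symlink_demo` are hypothetical names, the demo is Unix-only, and it assumes the process may create a scratch directory under the system temp directory.

```rust
use std::fs;
use std::io;
use std::path::{Component, Path, PathBuf};

/// Lexical normalization in the style of `absolute`, for an input that is
/// already absolute (hypothetical helper for this demo only).
fn lexical(path: &Path) -> PathBuf {
    let mut result = PathBuf::new();
    for component in path.components() {
        match component {
            Component::ParentDir => {
                result.pop();
            }
            Component::CurDir => {}
            other => result.push(other),
        }
    }
    result
}

/// Creates `<base>/real/sub` and a symlink `<base>/link -> <base>/real/sub`,
/// then normalizes `<base>/link/..` both lexically and via the kernel.
/// Returns `(base, lexical result, canonicalize result)`.
#[cfg(unix)]
fn symlink_demo() -> io::Result<(PathBuf, PathBuf, PathBuf)> {
    // Canonicalize the temp dir first so only our own symlink matters.
    let base = fs::canonicalize(std::env::temp_dir())?
        .join(format!("abs-demo-{}", std::process::id()));
    let _ = fs::remove_dir_all(&base);
    fs::create_dir_all(base.join("real").join("sub"))?;
    std::os::unix::fs::symlink(base.join("real").join("sub"), base.join("link"))?;

    let input = base.join("link").join("..");
    let lexical_result = lexical(&input);
    let kernel_result = fs::canonicalize(&input)?;

    fs::remove_dir_all(&base)?;
    Ok((base, lexical_result, kernel_result))
}
```

Here the lexical result is `<base>`, while `std::fs::canonicalize` follows the symlink first and yields `<base>/real`. This is the divergence that `absolute` accepts in exchange for not touching the filesystem.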