Merge branch 'trunk' into add-pkg-config

2024-11-10 10:02:38 +03:00 · 2020-11-07 22:06:08 -05:00 · 2020-11-07 22:06:08 -05:00 · 6929c63bb8
commit 6929c63bb8
parent c20b40a7d2 8c8412e248
22 changed files with 11740 additions and 421 deletions
--- a/cli/tests/repl_eval.rs
+++ b/cli/tests/repl_eval.rs
@ -110,6 +110,11 @@ mod repl_eval {
        );
    }

+    #[test]
+    fn str_count_graphemes() {
+        expect_success("Str.countGraphemes \"å🤔\"", "2 : Int");
+    }
+
    #[test]
    fn literal_empty_list() {
        expect_success("[]", "[] : List *");
--- a/compiler/builtins/README.md
+++ b/compiler/builtins/README.md
@ -58,12 +58,44 @@ This is where bottom-level functions that need to be written as LLVM are created
 ### builtins/src/std.rs
 It's one thing to actually write these functions, it's _another_ thing to let the Roc compiler know they exist as part of the standard library. You have to tell the compiler "Hey, this function exists, and it has this type signature". That happens in `std.rs`.

+## Specifying how we pass args to the function
+### builtins/mono/src/borrow.rs
+After we have all of this, we need to specify whether the arguments we're passing are owned, borrowed, or irrelevant. Towards the bottom of this file, add a new case for your builtin and specify each arg. Be sure to read the comment, as it explains this in more detail.
+
 ## Specifying the uniqueness of a function
 ### builtins/src/unique.rs
 One of the cool things about Roc is that it evaluates if a value in memory is shared between scopes or if it is used in just one place. If the value is used in one place then it is 'unique', and it therefore can be mutated in place. For a value created by a function, the uniqueness of the output is determined in part by the uniqueness of the input arguments. For example `List.single : elem -> List elem` can return a unique list if the `elem` is also unique.

 We have to define the uniqueness constraints of a function just like we have to define a type signature. That is what happens in `unique.rs`. This can be tricky, so it is a good step to ask for help with if it is confusing.

+## Testing it
+### solve/tests/solve_expr.rs
+To make sure that Roc is properly inferring the type of the new builtin, add a test to this file similar to:
+```
+#[test]
+fn atan() {
+    infer_eq_without_problem(
+        indoc!(
+            r#"
+            Num.atan
+            "#
+        ),
+        "Float -> Float",
+    );
+}
+```
+But replace `Num.atan` and the type signature with the new builtin.
+
+### gen/test/*.rs
+In this directory, there are a couple of files like `gen_num.rs`, `gen_str.rs`, etc. Tests for `Str` module builtins go in `gen_str.rs`, and so on. Find the file for the new builtin's module, and add a test like:
+```
+#[test]
+fn atan() {
+    assert_evals_to!("Num.atan 10", 1.4711276743037347, f64);
+}
+```
+But replace `Num.atan`, the return value, and the return type with your new builtin.
+
 # Mistakes that are easy to make!!

 When implementing a new builtin, it is often easy to copy and paste the implementation for an existing builtin. This can take you quite far since many builtins are very similar, but it also risks forgetting to change one small part of what you copied and pasted and losing a lot of time later on when you can't figure out why things don't work. So, speaking from experience, even if you are copying an existing builtin, try to implement it manually without copying and pasting. Two recent instances of this (as of September 7th, 2020):
--- a/compiler/builtins/bitcode/.gitignore
+++ b/compiler/builtins/bitcode/.gitignore
@ -1,2 +1,4 @@
 zig-cache
 src/zig-cache
+builtins.ll
+builtins.bc
--- a/compiler/builtins/bitcode/README.md
+++ b/compiler/builtins/bitcode/README.md
@ -1,5 +1,16 @@
 # Bitcode for Builtins

+## Adding a bitcode builtin
+
+To add a builtin:
+1. Add the function to the relevant module. For `Num` builtins use `src/num.zig`, for `Str` builtins use `src/str.zig`, and so on. **For anything you add, you must add tests for it!** Not only does this make the builtins more maintainable, it's the easiest way to test these functions in Zig. To run the tests, run: `zig build test`
+2. Make sure the function is public with the `pub` keyword and uses the C calling convention. This is really easy: just add `pub` and `callconv(.C)` to the function declaration like so: `pub fn atan(num: f64) callconv(.C) f64 { ... }`
+3. In `src/main.zig`, export the function. This is also organized by module. For example, for a `Num` function find the `Num` section and add: `comptime { exportNumFn(num.atan, "atan"); }`. The first argument is the function, the second is its name in LLVM.
+4. In `compiler/builtins/src/bitcode.rs`, add a constant for the new function. This is how we use it in Rust. Once again, this is organized by module, so just find the relevant area and add your new function.
+5. You can now call your function in Rust using `call_bitcode_fn` in `llvm/src/build.rs`!
+
+## How it works
+
 Roc's builtins are implemented in the compiler using LLVM only.
 When their implementations are simple enough (e.g. addition), they
 can be implemented directly in Inkwell.
@ -21,4 +32,4 @@ There will be two directories like `roc_builtins-[some random characters]`, look

 ## Calling bitcode functions

-Use the `call_bitcode_fn` function defined in `llvm/src/build.rs` to call bitcode funcitons.
+Use the `call_bitcode_fn` function defined in `llvm/src/build.rs` to call bitcode functions.
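
The naming scheme behind steps 3 and 4 of "Adding a bitcode builtin" can be sketched in Rust. This is a minimal illustration, not the contents of `bitcode.rs`: the helper `symbol_name` and the constant names `NUM_ATAN` and `STR_COUNT_GRAPHEME_CLUSTERS` are hypothetical, chosen only to show how the `roc_builtins.<module>.<name>` symbols produced by `exportNumFn` and `exportStrFn` would line up with string constants on the Rust side.

```rust
// Hypothetical sketch: mirrors the name composition done by the Zig
// export helpers ("roc_builtins." ++ "num." ++ fn_name).
pub const NAMESPACE: &str = "roc_builtins";

/// Builds the exported symbol name for a builtin in a given module.
pub fn symbol_name(module: &str, fn_name: &str) -> String {
    format!("{}.{}.{}", NAMESPACE, module, fn_name)
}

// Assumed constant names; the real ones in bitcode.rs may differ.
pub const NUM_ATAN: &str = "roc_builtins.num.atan";
pub const STR_COUNT_GRAPHEME_CLUSTERS: &str = "roc_builtins.str.count_grapheme_clusters";
```

If the string on the Rust side does not match the name passed to the export helper in `src/main.zig` exactly, the symbol lookup in the bitcode module would be expected to fail, so it is worth checking both places when adding a builtin.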
--- a/compiler/builtins/bitcode/build.zig
+++ b/compiler/builtins/bitcode/build.zig
@ -0,0 +1,59 @@
+const builtin = @import("builtin");
+const std = @import("std");
+const mem = std.mem;
+const Builder = std.build.Builder;
+
+pub fn build(b: *Builder) void {
+    b.setPreferredReleaseMode(builtin.Mode.ReleaseFast);
+    const mode = b.standardReleaseOptions();
+
+    // Options
+    const fallback_main_path = "./src/main.zig";
+    const main_path_desc = b.fmt("Override path to main.zig. Used by \"ir\", \"bc\", and \"test\". Defaults to \"{}\". ", .{fallback_main_path});
+    const main_path = b.option([]const u8, "main-path", main_path_desc) orelse fallback_main_path;
+
+    const fallback_bitcode_path = "./builtins.bc";
+    const bitcode_path_desc = b.fmt("Override path to generated bitcode file. Used by \"ir\" and \"bc\". Defaults to \"{}\". ", .{fallback_bitcode_path});
+    const bitcode_path = b.option([]const u8, "bc-path", bitcode_path_desc) orelse fallback_bitcode_path;
+
+    // Tests
+    var main_tests = b.addTest(main_path);
+    main_tests.setBuildMode(mode);
+    const test_step = b.step("test", "Run tests");
+    test_step.dependOn(&main_tests.step);
+
+    // Lib
+    const obj_name = "builtins";
+    const obj = b.addObject(obj_name, main_path);
+    obj.setBuildMode(mode);
+    obj.strip = true;
+    obj.emit_llvm_ir = true;
+    obj.emit_bin = false;
+    const ir = b.step("ir", "Build LLVM ir");
+    ir.dependOn(&obj.step);
+
+    // IR to Bitcode
+    const bitcode_path_arg = b.fmt("-o={}", .{bitcode_path});
+    const ir_out_file = b.fmt("{}.ll", .{obj_name});
+    const ir_to_bitcode = b.addSystemCommand(&[_][]const u8{
+        "llvm-as-10",
+        ir_out_file,
+        bitcode_path_arg
+    });
+
+    const bitcode = b.step("bc", "Build LLVM ir and convert to bitcode");
+    bitcode.dependOn(ir);
+    bitcode.dependOn(&ir_to_bitcode.step);
+
+    b.default_step = ir;
+    removeInstallSteps(b);
+}
+
+fn removeInstallSteps(b: *Builder) void {
+    // Advance the index only when nothing was removed, since swapRemove
+    // moves the last element into slot i.
+    var i: usize = 0;
+    while (i < b.top_level_steps.items.len) {
+        const name = b.top_level_steps.items[i].step.name;
+        if (mem.eql(u8, name, "install") or mem.eql(u8, name, "uninstall")) {
+            _ = b.top_level_steps.swapRemove(i);
+        } else {
+            i += 1;
+        }
+    }
+}
--- a/compiler/builtins/bitcode/src/helpers/grapheme.zig
+++ b/compiler/builtins/bitcode/src/helpers/grapheme.zig
--- a/compiler/builtins/bitcode/src/main.zig
+++ b/compiler/builtins/bitcode/src/main.zig
@ -1,384 +1,34 @@
+const builtin = @import("builtin");
 const std = @import("std");
-const math = std.math;
-const expect = std.testing.expect;
+const testing = std.testing;

-const roc_builtins_namespace = "roc_builtins";
-const math_namespace = roc_builtins_namespace ++ ".math";
-const str_namespace = roc_builtins_namespace ++ ".str";
+// Num Module
+const num = @import("num.zig");
+comptime { exportNumFn(num.atan, "atan"); }
+comptime { exportNumFn(num.isFinite, "is_finite"); }
+comptime { exportNumFn(num.powInt, "pow_int"); }
+comptime { exportNumFn(num.acos, "acos"); }
+comptime { exportNumFn(num.asin, "asin"); }

-comptime { @export(atan, .{ .name = math_namespace ++ ".atan", .linkage = .Strong  }); }
-fn atan(num: f64) callconv(.C) f64 {
-    return math.atan(num);
+// Str Module
+const str = @import("str.zig");
+comptime { exportStrFn(str.strSplitInPlace, "str_split_in_place"); }
+comptime { exportStrFn(str.countSegments, "count_segements"); }
+comptime { exportStrFn(str.countGraphemeClusters, "count_grapheme_clusters"); }
+
+// Export helpers - Must be run inside a comptime
+fn exportBuiltinFn(comptime fn_target: anytype, comptime fn_name: []const u8) void {
+    @export(fn_target, .{ .name = "roc_builtins." ++ fn_name, .linkage = .Strong  });
+}
+fn exportNumFn(comptime fn_target: anytype, comptime fn_name: []const u8) void {
+    exportBuiltinFn(fn_target, "num." ++ fn_name);
+}
+fn exportStrFn(comptime fn_target: anytype, comptime fn_name: []const u8) void {
+    exportBuiltinFn(fn_target, "str." ++ fn_name);
 }

-comptime { @export(isFinite, .{ .name = math_namespace ++ ".is_finite", .linkage = .Strong  }); }
-fn isFinite(num: f64) callconv(.C) bool {
-    return math.isFinite(num);
-}
-
-comptime { @export(powInt, .{ .name = math_namespace ++ ".pow_int", .linkage = .Strong  }); }
-fn powInt(base: i64, exp: i64) callconv(.C) i64 {
-    return math.pow(i64, base, exp);
-}
-
-comptime { @export(acos, .{ .name = math_namespace ++ ".acos", .linkage = .Strong  }); }
-fn acos(num: f64) callconv(.C) f64 {
-    return math.acos(num);    
-}
-
-comptime { @export(asin, .{ .name = math_namespace ++ ".asin", .linkage = .Strong  }); }
-fn asin(num: f64) callconv(.C) f64 {
-    return math.asin(num);
-}
-
-
-// Str.split
-
-const RocStr = struct {
-    str_bytes_ptrs: [*]u8,
-    str_len: usize,
-
-    pub fn init(bytes: [*]u8, len: usize) RocStr {
-        return RocStr {
-            .str_bytes_ptrs = bytes,
-            .str_len = len
-        };
-    }
-
-    pub fn eq(self: RocStr, other: RocStr) bool {
-        if (self.str_len != other.str_len) {
-            return false;
-        }
-
-        var areEq: bool = true;
-        var index: usize = 0;
-        while (index < self.str_len and areEq) {
-            areEq = areEq and self.str_bytes_ptrs[index] == other.str_bytes_ptrs[index];
-            index = index + 1;
-        }
-
-        return areEq;
-    }
-
-    test "RocStr.eq: equal" {
-        const str1_len = 3;
-        var str1: [str1_len]u8 = "abc".*;
-        const str1_ptr: [*]u8 = &str1;
-        var roc_str1 = RocStr.init(str1_ptr, str1_len);
-
-        const str2_len = 3;
-        var str2: [str2_len]u8 = "abc".*;
-        const str2_ptr: [*]u8 = &str2;
-        var roc_str2 = RocStr.init(str2_ptr, str2_len);
-
-        expect(RocStr.eq(roc_str1, roc_str2));
-    }
-
-    test "RocStr.eq: not equal different length" {
-        const str1_len = 4;
-        var str1: [str1_len]u8 = "abcd".*;
-        const str1_ptr: [*]u8 = &str1;
-        var roc_str1 = RocStr.init(str1_ptr, str1_len);
-
-        const str2_len = 3;
-        var str2: [str2_len]u8 = "abc".*;
-        const str2_ptr: [*]u8 = &str2;
-        var roc_str2 = RocStr.init(str2_ptr, str2_len);
-
-        expect(!RocStr.eq(roc_str1, roc_str2));
-    }
-
-    test "RocStr.eq: not equal same length" {
-        const str1_len = 3;
-        var str1: [str1_len]u8 = "acb".*;
-        const str1_ptr: [*]u8 = &str1;
-        var roc_str1 = RocStr.init(str1_ptr, str1_len);
-
-        const str2_len = 3;
-        var str2: [str2_len]u8 = "abc".*;
-        const str2_ptr: [*]u8 = &str2;
-        var roc_str2 = RocStr.init(str2_ptr, str2_len);
-
-        expect(!RocStr.eq(roc_str1, roc_str2));
-    }
-};
-
-comptime { @export(strSplitInPlace, .{ .name = str_namespace ++ ".str_split_in_place", .linkage = .Strong  }); }
-fn strSplitInPlace(
-    array: [*]RocStr,
-    array_len: usize,
-    str_bytes_ptrs: [*]u8,
-    str_len: usize,
-    delimiter_bytes: [*]u8,
-    delimiter_len: usize
-) callconv(.C) void {
-    var ret_array_index : usize = 0;
-
-    var sliceStart_index : usize = 0;
-
-    var str_index : usize = 0;
-
-    if (str_len > delimiter_len) {
-        const end_index : usize = str_len - delimiter_len;
-        while (str_index <= end_index) {
-            var delimiter_index : usize = 0;
-            var matches_delimiter = true;
-
-            while (delimiter_index < delimiter_len) {
-                var delimiterChar = delimiter_bytes[delimiter_index];
-                var strChar = str_bytes_ptrs[str_index + delimiter_index];
-
-                if (delimiterChar != strChar) {
-                    matches_delimiter = false;
-                    break;
-                }
-
-                delimiter_index += 1;
-            }
-
-            if (matches_delimiter) {
-                array[ret_array_index] = RocStr.init(str_bytes_ptrs + sliceStart_index, str_index - sliceStart_index);
-                sliceStart_index = str_index + delimiter_len;
-                ret_array_index += 1;
-                str_index += delimiter_len;
-            } else {
-                str_index += 1;
-            }
-        }
-    }
-
-    array[ret_array_index] = RocStr.init(str_bytes_ptrs + sliceStart_index, str_len - sliceStart_index);
-}
-
-test "strSplitInPlace: no delimiter" {
-    // Str.split "abc" "!" == [ "abc" ]
-
-    var str: [3]u8 = "abc".*;
-    const str_ptr: [*]u8 = &str;
-
-    var delimiter: [1]u8 = "!".*;
-    const delimiter_ptr: [*]u8 = &delimiter;
-
-    var array: [1]RocStr = undefined;
-    const array_ptr: [*]RocStr = &array;
-
-    strSplitInPlace(
-        array_ptr,
-        1,
-        str_ptr,
-        3,
-        delimiter_ptr,
-        1
-    );
-
-    var expected = [1]RocStr{
-        RocStr.init(str_ptr, 3),
-    };
-
-    expect(array.len == expected.len);
-    expect(RocStr.eq(array[0], expected[0]));
-}
-
-test "strSplitInPlace: delimiter on sides" {
-    // Str.split "tttghittt" "ttt" == [ "", "ghi", "" ]
-
-    const str_len: usize = 9;
-    var str: [str_len]u8 = "tttghittt".*;
-    const str_ptr: [*]u8 = &str;
-
-    const delimiter_len = 3;
-    var delimiter: [delimiter_len]u8 = "ttt".*;
-    const delimiter_ptr: [*]u8 = &delimiter;
-
-    const array_len : usize = 3;
-    var array: [array_len]RocStr = [_]RocStr{
-        undefined ,
-        undefined,
-        undefined,
-    };
-    const array_ptr: [*]RocStr = &array;
-
-    strSplitInPlace(
-        array_ptr,
-        array_len,
-        str_ptr,
-        str_len,
-        delimiter_ptr,
-        delimiter_len
-    );
-
-    const expected_str_len: usize = 3;
-    var expected_str: [expected_str_len]u8 = "ghi".*;
-    const expected_str_ptr: [*]u8 = &expected_str;
-    var expectedRocStr = RocStr.init(expected_str_ptr, expected_str_len);
-
-    expect(array.len == 3);
-    expect(array[0].str_len == 0);
-    expect(RocStr.eq(array[1], expectedRocStr));
-    expect(array[2].str_len == 0);
-}
-
-test "strSplitInPlace: three pieces" {
-    // Str.split "a!b!c" "!" == [ "a", "b", "c" ]
-
-    const str_len: usize = 5;
-    var str: [str_len]u8 = "a!b!c".*;
-    const str_ptr: [*]u8 = &str;
-
-    const delimiter_len = 1;
-    var delimiter: [delimiter_len]u8 = "!".*;
-    const delimiter_ptr: [*]u8 = &delimiter;
-
-    const array_len : usize = 3;
-    var array: [array_len]RocStr = undefined;
-    const array_ptr: [*]RocStr = &array;
-
-    strSplitInPlace(
-        array_ptr,
-        array_len,
-        str_ptr,
-        str_len,
-        delimiter_ptr,
-        delimiter_len
-    );
-
-    var a: [1]u8 = "a".*;
-    const a_ptr: [*]u8 = &a;
-
-    var b: [1]u8 = "b".*;
-    const b_ptr: [*]u8 = &b;
-
-    var c: [1]u8 = "c".*;
-    const c_ptr: [*]u8 = &c;
-
-    var expected_array = [array_len]RocStr{
-        RocStr{
-            .str_bytes_ptrs = a_ptr,
-            .str_len = 1,
-        },
-        RocStr{
-            .str_bytes_ptrs = b_ptr,
-            .str_len = 1,
-        },
-        RocStr{
-            .str_bytes_ptrs = c_ptr,
-            .str_len = 1,
-        }
-    };
-
-    expect(expected_array.len == array.len);
-    expect(RocStr.eq(array[0], expected_array[0]));
-    expect(RocStr.eq(array[1], expected_array[1]));
-    expect(RocStr.eq(array[2], expected_array[2]));
-}
-
-// This is used for `Str.split : Str, Str -> Array Str
-// It is used to count how many segments the input `_str`
-// needs to be broken into, so that we can allocate a array
-// of that size. It always returns at least 1.
-comptime { @export(countSegments, .{ .name = str_namespace ++ ".count_segements", .linkage = .Strong  }); }
-fn countSegments(
-    str_bytes_ptrs: [*]u8,
-    str_len: usize,
-    delimiter_bytes: [*]u8,
-    delimiter_len: usize
-) callconv(.C) i64 {
-    var count: i64 = 1;
-
-    if (str_len > delimiter_len) {
-        var str_index: usize = 0;
-        const end_cond: usize = str_len - delimiter_len;
-
-        while (str_index < end_cond) {
-            var delimiter_index: usize = 0;
-
-            var matches_delimiter = true;
-
-            while (delimiter_index < delimiter_len) {
-                const delimiterChar = delimiter_bytes[delimiter_index];
-                const strChar = str_bytes_ptrs[str_index + delimiter_index];
-
-                if (delimiterChar != strChar) {
-                    matches_delimiter = false;
-                    break;
-                }
-
-                delimiter_index += 1;
-            }
-
-            if (matches_delimiter) {
-                count += 1;
-            }
-
-            str_index += 1;
-        }
-    }
-
-    return count;
-}
-
-test "countSegments: long delimiter" {
-    // Str.split "str" "delimiter" == [ "str" ]
-    // 1 segment
-
-    const str_len: usize = 3;
-    var str: [str_len]u8 = "str".*;
-    const str_ptr: [*]u8 = &str;
-
-    const delimiter_len = 9;
-    var delimiter: [delimiter_len]u8 = "delimiter".*;
-    const delimiter_ptr: [*]u8 = &delimiter;
-
-    const segments_count = countSegments(
-        str_ptr,
-        str_len,
-        delimiter_ptr,
-        delimiter_len
-    );
-
-    expect(segments_count == 1);
-}
-
-test "countSegments: delimiter at start" {
-    // Str.split "hello there" "hello" == [ "", " there" ]
-    // 2 segments
-
-    const str_len: usize = 11;
-    var str: [str_len]u8 = "hello there".*;
-    const str_ptr: [*]u8 = &str;
-
-    const delimiter_len = 5;
-    var delimiter: [delimiter_len]u8 = "hello".*;
-    const delimiter_ptr: [*]u8 = &delimiter;
-
-    const segments_count = countSegments(
-        str_ptr,
-        str_len,
-        delimiter_ptr,
-        delimiter_len
-    );
-
-    expect(segments_count == 2);
-}
-
-test "countSegments: delimiter interspered" {
-    // Str.split "a!b!c" "!" == [ "a", "b", "c" ]
-    // 3 segments
-
-    const str_len: usize = 5;
-    var str: [str_len]u8 = "a!b!c".*;
-    const str_ptr: [*]u8 = &str;
-
-    const delimiter_len = 1;
-    var delimiter: [delimiter_len]u8 = "!".*;
-    const delimiter_ptr: [*]u8 = &delimiter;
-
-    const segments_count = countSegments(
-        str_ptr,
-        str_len,
-        delimiter_ptr,
-        delimiter_len
-    );
-
-    expect(segments_count == 3);
+// Run all tests in imported modules
+// https://github.com/ziglang/zig/blob/master/lib/std/std.zig#L94
+test "" {
+    testing.refAllDecls(@This());
 }
--- a/compiler/builtins/bitcode/src/num.zig
+++ b/compiler/builtins/bitcode/src/num.zig
@ -0,0 +1,22 @@
+const std = @import("std");
+const math = std.math;
+
+pub fn atan(num: f64) callconv(.C) f64 {
+    return math.atan(num);
+}
+
+pub fn isFinite(num: f64) callconv(.C) bool {
+    return math.isFinite(num);
+}
+
+pub fn powInt(base: i64, exp: i64) callconv(.C) i64 {
+    return math.pow(i64, base, exp);
+}
+
+pub fn acos(num: f64) callconv(.C) f64 {
+    return math.acos(num);
+}
+
+pub fn asin(num: f64) callconv(.C) f64 {
+    return math.asin(num);
+}
--- a/compiler/builtins/bitcode/src/str.zig
+++ b/compiler/builtins/bitcode/src/str.zig
@ -0,0 +1,437 @@
+const std = @import("std");
+const unicode = std.unicode;
+const testing = std.testing;
+const expectEqual = testing.expectEqual;
+const expect = testing.expect;
+
+const RocStr = struct {
+    str_bytes_ptrs: [*]u8,
+    str_len: usize,
+
+    pub fn init(bytes: [*]u8, len: usize) RocStr {
+        return RocStr {
+            .str_bytes_ptrs = bytes,
+            .str_len = len
+        };
+    }
+
+    pub fn eq(self: *RocStr, other: RocStr) bool {
+        if (self.str_len != other.str_len) {
+            return false;
+        }
+
+        var areEq: bool = true;
+        var index: usize = 0;
+        while (index < self.str_len and areEq) {
+            areEq = areEq and self.str_bytes_ptrs[index] == other.str_bytes_ptrs[index];
+            index = index + 1;
+        }
+
+        return areEq;
+    }
+
+    test "RocStr.eq: equal" {
+        const str1_len = 3;
+        var str1: [str1_len]u8 = "abc".*;
+        const str1_ptr: [*]u8 = &str1;
+        var roc_str1 = RocStr.init(str1_ptr, str1_len);
+
+        const str2_len = 3;
+        var str2: [str2_len]u8 = "abc".*;
+        const str2_ptr: [*]u8 = &str2;
+        var roc_str2 = RocStr.init(str2_ptr, str2_len);
+
+        expect(roc_str1.eq(roc_str2));
+    }
+
+    test "RocStr.eq: not equal different length" {
+        const str1_len = 4;
+        var str1: [str1_len]u8 = "abcd".*;
+        const str1_ptr: [*]u8 = &str1;
+        var roc_str1 = RocStr.init(str1_ptr, str1_len);
+
+        const str2_len = 3;
+        var str2: [str2_len]u8 = "abc".*;
+        const str2_ptr: [*]u8 = &str2;
+        var roc_str2 = RocStr.init(str2_ptr, str2_len);
+
+        expect(!roc_str1.eq(roc_str2));
+    }
+
+    test "RocStr.eq: not equal same length" {
+        const str1_len = 3;
+        var str1: [str1_len]u8 = "acb".*;
+        const str1_ptr: [*]u8 = &str1;
+        var roc_str1 = RocStr.init(str1_ptr, str1_len);
+
+        const str2_len = 3;
+        var str2: [str2_len]u8 = "abc".*;
+        const str2_ptr: [*]u8 = &str2;
+        var roc_str2 = RocStr.init(str2_ptr, str2_len);
+
+        expect(!roc_str1.eq(roc_str2));
+    }
+};
+
+// Str.split
+
+pub fn strSplitInPlace(
+    array: [*]RocStr,
+    array_len: usize,
+    str_bytes_ptrs: [*]u8,
+    str_len: usize,
+    delimiter_bytes: [*]u8,
+    delimiter_len: usize
+) callconv(.C) void {
+    var ret_array_index : usize = 0;
+
+    var sliceStart_index : usize = 0;
+
+    var str_index : usize = 0;
+
+    if (str_len > delimiter_len) {
+        const end_index : usize = str_len - delimiter_len;
+        while (str_index <= end_index) {
+            var delimiter_index : usize = 0;
+            var matches_delimiter = true;
+
+            while (delimiter_index < delimiter_len) {
+                var delimiterChar = delimiter_bytes[delimiter_index];
+                var strChar = str_bytes_ptrs[str_index + delimiter_index];
+
+                if (delimiterChar != strChar) {
+                    matches_delimiter = false;
+                    break;
+                }
+
+                delimiter_index += 1;
+            }
+
+            if (matches_delimiter) {
+                array[ret_array_index] = RocStr.init(str_bytes_ptrs + sliceStart_index, str_index - sliceStart_index);
+                sliceStart_index = str_index + delimiter_len;
+                ret_array_index += 1;
+                str_index += delimiter_len;
+            } else {
+                str_index += 1;
+            }
+        }
+    }
+
+    array[ret_array_index] = RocStr.init(str_bytes_ptrs + sliceStart_index, str_len - sliceStart_index);
+}
+
+test "strSplitInPlace: no delimiter" {
+    // Str.split "abc" "!" == [ "abc" ]
+
+    var str: [3]u8 = "abc".*;
+    const str_ptr: [*]u8 = &str;
+
+    var delimiter: [1]u8 = "!".*;
+    const delimiter_ptr: [*]u8 = &delimiter;
+
+    var array: [1]RocStr = undefined;
+    const array_ptr: [*]RocStr = &array;
+
+    strSplitInPlace(
+        array_ptr,
+        1,
+        str_ptr,
+        3,
+        delimiter_ptr,
+        1
+    );
+
+    var expected = [1]RocStr{
+        RocStr.init(str_ptr, 3),
+    };
+
+    expectEqual(array.len, expected.len);
+    expect(array[0].eq(expected[0]));
+}
+
+test "strSplitInPlace: delimiter on sides" {
+    // Str.split "tttghittt" "ttt" == [ "", "ghi", "" ]
+
+    const str_len: usize = 9;
+    var str: [str_len]u8 = "tttghittt".*;
+    const str_ptr: [*]u8 = &str;
+
+    const delimiter_len = 3;
+    var delimiter: [delimiter_len]u8 = "ttt".*;
+    const delimiter_ptr: [*]u8 = &delimiter;
+
+    const array_len : usize = 3;
+    var array: [array_len]RocStr = [_]RocStr{
+        undefined ,
+        undefined,
+        undefined,
+    };
+    const array_ptr: [*]RocStr = &array;
+
+    strSplitInPlace(
+        array_ptr,
+        array_len,
+        str_ptr,
+        str_len,
+        delimiter_ptr,
+        delimiter_len
+    );
+
+    const expected_str_len: usize = 3;
+    var expected_str: [expected_str_len]u8 = "ghi".*;
+    const expected_str_ptr: [*]u8 = &expected_str;
+    var expectedRocStr = RocStr.init(expected_str_ptr, expected_str_len);
+
+    expectEqual(array.len, 3);
+    expectEqual(array[0].str_len, 0);
+    expect(array[1].eq(expectedRocStr));
+    expectEqual(array[2].str_len, 0);
+}
+
+test "strSplitInPlace: three pieces" {
+    // Str.split "a!b!c" "!" == [ "a", "b", "c" ]
+
+    const str_len: usize = 5;
+    var str: [str_len]u8 = "a!b!c".*;
+    const str_ptr: [*]u8 = &str;
+
+    const delimiter_len = 1;
+    var delimiter: [delimiter_len]u8 = "!".*;
+    const delimiter_ptr: [*]u8 = &delimiter;
+
+    const array_len : usize = 3;
+    var array: [array_len]RocStr = undefined;
+    const array_ptr: [*]RocStr = &array;
+
+    strSplitInPlace(
+        array_ptr,
+        array_len,
+        str_ptr,
+        str_len,
+        delimiter_ptr,
+        delimiter_len
+    );
+
+    var a: [1]u8 = "a".*;
+    const a_ptr: [*]u8 = &a;
+
+    var b: [1]u8 = "b".*;
+    const b_ptr: [*]u8 = &b;
+
+    var c: [1]u8 = "c".*;
+    const c_ptr: [*]u8 = &c;
+
+    var expected_array = [array_len]RocStr{
+        RocStr{
+            .str_bytes_ptrs = a_ptr,
+            .str_len = 1,
+        },
+        RocStr{
+            .str_bytes_ptrs = b_ptr,
+            .str_len = 1,
+        },
+        RocStr{
+            .str_bytes_ptrs = c_ptr,
+            .str_len = 1,
+        }
+    };
+
+    expectEqual(expected_array.len, array.len);
+    expect(array[0].eq(expected_array[0]));
+    expect(array[1].eq(expected_array[1]));
+    expect(array[2].eq(expected_array[2]));
+}
+
+// This is used for `Str.split : Str, Str -> Array Str`.
+// It is used to count how many segments the input `_str`
+// needs to be broken into, so that we can allocate an array
+// of that size. It always returns at least 1.
+pub fn countSegments(
+    str_bytes_ptrs: [*]u8,
+    str_len: usize,
+    delimiter_bytes: [*]u8,
+    delimiter_len: usize
+) callconv(.C) i64 {
+    var count: i64 = 1;
+
+    if (str_len > delimiter_len) {
+        var str_index: usize = 0;
+        const end_cond: usize = str_len - delimiter_len;
+
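+        // Inclusive bound, matching strSplitInPlace, so that a delimiter
+        // at the very end of the string is counted too.
+        while (str_index <= end_cond) {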
+            var delimiter_index: usize = 0;
+
+            var matches_delimiter = true;
+
+            while (delimiter_index < delimiter_len) {
+                const delimiterChar = delimiter_bytes[delimiter_index];
+                const strChar = str_bytes_ptrs[str_index + delimiter_index];
+
+                if (delimiterChar != strChar) {
+                    matches_delimiter = false;
+                    break;
+                }
+
+                delimiter_index += 1;
+            }
+
+            if (matches_delimiter) {
+                count += 1;
+            }
+
+            str_index += 1;
+        }
+    }
+
+    return count;
+}
+
+test "countSegments: long delimiter" {
+    // Str.split "str" "delimiter" == [ "str" ]
+    // 1 segment
+
+    const str_len: usize = 3;
+    var str: [str_len]u8 = "str".*;
+    const str_ptr: [*]u8 = &str;
+
+    const delimiter_len = 9;
+    var delimiter: [delimiter_len]u8 = "delimiter".*;
+    const delimiter_ptr: [*]u8 = &delimiter;
+
+    const segments_count = countSegments(
+        str_ptr,
+        str_len,
+        delimiter_ptr,
+        delimiter_len
+    );
+
+    expectEqual(segments_count, 1);
+}
+
+test "countSegments: delimiter at start" {
+    // Str.split "hello there" "hello" == [ "", " there" ]
+    // 2 segments
+
+    const str_len: usize = 11;
+    var str: [str_len]u8 = "hello there".*;
+    const str_ptr: [*]u8 = &str;
+
+    const delimiter_len = 5;
+    var delimiter: [delimiter_len]u8 = "hello".*;
+    const delimiter_ptr: [*]u8 = &delimiter;
+
+    const segments_count = countSegments(
+        str_ptr,
+        str_len,
+        delimiter_ptr,
+        delimiter_len
+    );
+
+    expectEqual(segments_count, 2);
+}
+
+test "countSegments: delimiter interspersed" {
+    // Str.split "a!b!c" "!" == [ "a", "b", "c" ]
+    // 3 segments
+
+    const str_len: usize = 5;
+    var str: [str_len]u8 = "a!b!c".*;
+    const str_ptr: [*]u8 = &str;
+
+    const delimiter_len = 1;
+    var delimiter: [delimiter_len]u8 = "!".*;
+    const delimiter_ptr: [*]u8 = &delimiter;
+
+    const segments_count = countSegments(
+        str_ptr,
+        str_len,
+        delimiter_ptr,
+        delimiter_len
+    );
+
+    expectEqual(segments_count, 3);
+}
+
+// Str.countGraphemeClusters
+const grapheme = @import("helpers/grapheme.zig");
+
+pub fn countGraphemeClusters(bytes_ptr: [*]u8, bytes_len: usize) callconv(.C) usize {
+    var bytes = bytes_ptr[0..bytes_len];
+    var iter = (unicode.Utf8View.init(bytes) catch unreachable).iterator();
+
+    var count: usize = 0;
+    var grapheme_break_state: ?grapheme.BoundClass = null;
+    var grapheme_break_state_ptr = &grapheme_break_state;
+    var opt_last_codepoint: ?u21 = null;
+    while (iter.nextCodepoint()) |cur_codepoint| {
+        if (opt_last_codepoint) |last_codepoint| {
+            var did_break = grapheme.isGraphemeBreak(
+                last_codepoint,
+                cur_codepoint,
+                grapheme_break_state_ptr
+            );
+            if (did_break) {
+                count += 1;
+                grapheme_break_state = null;
+            }
+        }
+        opt_last_codepoint = cur_codepoint;
+    }
+
+    // If there are no breaks, but the str is not empty, then there
+    // must be a single grapheme
+    if (bytes_len != 0) {
+        count += 1;
+    }
+
+    return count;
+}
+
+test "countGraphemeClusters: empty string" {
+    var bytes_arr = "".*;
+    var bytes_len = bytes_arr.len;
+    var bytes_ptr: [*]u8 = &bytes_arr;
+    var count = countGraphemeClusters(bytes_ptr, bytes_len);
+    expectEqual(count, 0);
+}
+
+test "countGraphemeClusters: ascii characters" {
+    var bytes_arr = "abcd".*;
+    var bytes_len = bytes_arr.len;
+    var bytes_ptr: [*]u8 = &bytes_arr;
+    var count = countGraphemeClusters(bytes_ptr, bytes_len);
+    expectEqual(count, 4);
+}
+
+test "countGraphemeClusters: utf8 characters" {
+    var bytes_arr = "ãxā".*;
+    var bytes_len = bytes_arr.len;
+    var bytes_ptr: [*]u8 = &bytes_arr;
+    var count = countGraphemeClusters(bytes_ptr, bytes_len);
+    expectEqual(count, 3);
+}
+
+test "countGraphemeClusters: emojis" {
+    var bytes_arr = "🤔🤔🤔".*;
+    var bytes_len = bytes_arr.len;
+    var bytes_ptr: [*]u8 = &bytes_arr;
+    var count = countGraphemeClusters(bytes_ptr, bytes_len);
+    expectEqual(count, 3);
+}
+
+test "countGraphemeClusters: emojis and utf8 characters" {
+    var bytes_arr = "🤔å🤔¥🤔ç".*;
+    var bytes_len = bytes_arr.len;
+    var bytes_ptr: [*]u8 = &bytes_arr;
+    var count = countGraphemeClusters(bytes_ptr, bytes_len);
+    expectEqual(count, 6);
+}
+
+test "countGraphemeClusters: emojis, utf8, and ascii characters" {
+    var bytes_arr = "6🤔å🤔e¥🤔çpp".*;
+    var bytes_len = bytes_arr.len;
+    var bytes_ptr: [*]u8 = &bytes_arr;
+    var count = countGraphemeClusters(bytes_ptr, bytes_len);
+    expectEqual(count, 10);
+}
--- a/compiler/builtins/build.rs
+++ b/compiler/builtins/build.rs
@ -5,26 +5,7 @@ use std::path::Path;
 use std::process::Command;
 use std::str;

-fn run_command<S, I>(command: &str, args: I)
-where
-    I: IntoIterator<Item = S>,
-    S: AsRef<OsStr>,
-{
-    let output_result = Command::new(OsStr::new(&command)).args(args).output();
-    match output_result {
-        Ok(output) => match output.status.success() {
-            true => (),
-            false => {
-                let error_str = match str::from_utf8(&output.stderr) {
-                    Ok(stderr) => stderr.to_string(),
-                    Err(_) => format!("Failed to run \"{}\"", command),
-                };
-                panic!("{} failed: {}", command, error_str);
-            }
-        },
-        Err(reason) => panic!("{} failed: {}", command, reason),
-    }
-}
+// TODO: Use zig build system command instead

 fn main() {
    let out_dir = env::var_os("OUT_DIR").unwrap();
@ -64,7 +45,29 @@ fn main() {

    run_command("llvm-as-10", &[dest_ll, "-o", dest_bc]);

+    // TODO: Recursively search the zig src dir and watch each file
    println!("cargo:rerun-if-changed=build.rs");
    println!("cargo:rerun-if-changed={}", src_path_str);
    println!("cargo:rustc-env=BUILTINS_BC={}", dest_bc);
 }
+
+fn run_command<S, I>(command: &str, args: I)
+where
+    I: IntoIterator<Item = S>,
+    S: AsRef<OsStr>,
+{
+    let output_result = Command::new(OsStr::new(&command)).args(args).output();
+    match output_result {
+        Ok(output) => match output.status.success() {
+            true => (),
+            false => {
+                let error_str = match str::from_utf8(&output.stderr) {
+                    Ok(stderr) => stderr.to_string(),
+                    Err(_) => format!("Failed to run \"{}\"", command),
+                };
+                panic!("{} failed: {}", command, error_str);
+            }
+        },
+        Err(reason) => panic!("{} failed: {}", command, reason),
+    }
+}
--- a/compiler/builtins/src/bitcode.rs
+++ b/compiler/builtins/src/bitcode.rs
@ -17,11 +17,12 @@ pub fn get_bytes() -> Vec<u8> {
    buffer
 }

-pub const MATH_ASIN: &str = "roc_builtins.math.asin";
-pub const MATH_ACOS: &str = "roc_builtins.math.acos";
-pub const MATH_ATAN: &str = "roc_builtins.math.atan";
-pub const MATH_IS_FINITE: &str = "roc_builtins.math.is_finite";
-pub const MATH_POW_INT: &str = "roc_builtins.math.pow_int";
+pub const NUM_ASIN: &str = "roc_builtins.num.asin";
+pub const NUM_ACOS: &str = "roc_builtins.num.acos";
+pub const NUM_ATAN: &str = "roc_builtins.num.atan";
+pub const NUM_IS_FINITE: &str = "roc_builtins.num.is_finite";
+pub const NUM_POW_INT: &str = "roc_builtins.num.pow_int";

 pub const STR_COUNT_SEGEMENTS: &str = "roc_builtins.str.count_segements";
-pub const STR_STR_SPLIT_IN_PLACE: &str = "roc_builtins.str.str_split_in_place";
+pub const STR_SPLIT_IN_PLACE: &str = "roc_builtins.str.str_split_in_place";
+pub const STR_COUNT_GRAPEHEME_CLUSTERS: &str = "roc_builtins.str.count_grapheme_clusters";
--- a/compiler/builtins/src/std.rs
+++ b/compiler/builtins/src/std.rs
@ -402,6 +402,12 @@ pub fn types() -> MutMap<Symbol, (SolvedType, Region)> {
        top_level_function(vec![str_type()], Box::new(bool_type())),
    );

+    // countGraphemes : Str -> Int
+    add_type(
+        Symbol::STR_COUNT_GRAPHEMES,
+        top_level_function(vec![str_type()], Box::new(int_type())),
+    );
+
    // List module

    // get : List elem, Int -> Result elem [ OutOfBounds ]*
--- a/compiler/builtins/src/unique.rs
+++ b/compiler/builtins/src/unique.rs
@ -1028,6 +1028,12 @@ pub fn types() -> MutMap<Symbol, (SolvedType, Region)> {
        unique_function(vec![str_type(star1), str_type(star2)], str_type(star3))
    });

+    // Str.countGraphemes : Attr * Str -> Attr * Int
+    add_type(Symbol::STR_COUNT_GRAPHEMES, {
+        let_tvars! { star1, star2 };
+        unique_function(vec![str_type(star1)], int_type(star2))
+    });
+
    // Result module

    // map : Attr * (Result (Attr a e))
--- a/compiler/can/src/builtins.rs
+++ b/compiler/can/src/builtins.rs
@ -52,6 +52,7 @@ pub fn builtin_defs(var_store: &mut VarStore) -> MutMap<Symbol, Def> {
        Symbol::BOOL_NOT => bool_not,
        Symbol::STR_CONCAT => str_concat,
        Symbol::STR_IS_EMPTY => str_is_empty,
+        Symbol::STR_COUNT_GRAPHEMES => str_count_graphemes,
        Symbol::LIST_LEN => list_len,
        Symbol::LIST_GET => list_get,
        Symbol::LIST_SET => list_set,
@ -924,7 +925,7 @@ fn str_concat(symbol: Symbol, var_store: &mut VarStore) -> Def {
    )
 }

-/// Str.isEmpty : List * -> Bool
+/// Str.isEmpty : Str -> Bool
 fn str_is_empty(symbol: Symbol, var_store: &mut VarStore) -> Def {
    let str_var = var_store.fresh();
    let bool_var = var_store.fresh();
@ -944,6 +945,26 @@ fn str_is_empty(symbol: Symbol, var_store: &mut VarStore) -> Def {
    )
 }

+/// Str.countGraphemes : Str -> Int
+fn str_count_graphemes(symbol: Symbol, var_store: &mut VarStore) -> Def {
+    let str_var = var_store.fresh();
+    let int_var = var_store.fresh();
+
+    let body = RunLowLevel {
+        op: LowLevel::StrCountGraphemes,
+        args: vec![(str_var, Var(Symbol::ARG_1))],
+        ret_var: int_var,
+    };
+
+    defn(
+        symbol,
+        vec![(str_var, Symbol::ARG_1)],
+        var_store,
+        body,
+        int_var,
+    )
+}
+
 /// List.concat : List elem, List elem -> List elem
 fn list_concat(symbol: Symbol, var_store: &mut VarStore) -> Def {
    let list_var = var_store.fresh();
--- a/compiler/gen/src/llvm/bitcode.rs
+++ b/compiler/gen/src/llvm/bitcode.rs
@ -0,0 +1,22 @
+use crate::llvm::build::Env;
+use inkwell::values::BasicValueEnum;
+use roc_module::low_level::LowLevel;
+
+pub fn call_bitcode_fn<'a, 'ctx, 'env>(
+    op: LowLevel,
+    env: &Env<'a, 'ctx, 'env>,
+    args: &[BasicValueEnum<'ctx>],
+    fn_name: &str,
+) -> BasicValueEnum<'ctx> {
+    let fn_val = env
+                .module
+                .get_function(fn_name)
+                .unwrap_or_else(|| panic!("Unrecognized builtin function: {:?} - if you're working on the Roc compiler, do you need to rebuild the bitcode? See compiler/builtins/bitcode/README.md", fn_name));
+    let call = env.builder.build_call(fn_val, args, "call_builtin");
+
+    call.set_call_convention(fn_val.get_call_conventions());
+
+    call.try_as_basic_value()
+        .left()
+        .unwrap_or_else(|| panic!("LLVM error: Invalid call for low-level op {:?}", op))
+}
--- a/compiler/gen/src/llvm/build.rs
+++ b/compiler/gen/src/llvm/build.rs
@ -4,7 +4,7 @@ use crate::llvm::build_list::{
    list_get_unsafe, list_join, list_keep_if, list_len, list_map, list_prepend, list_repeat,
    list_reverse, list_set, list_single, list_walk_right,
 };
-use crate::llvm::build_str::{str_concat, str_len, CHAR_LAYOUT};
+use crate::llvm::build_str::{str_concat, str_count_graphemes, str_len, CHAR_LAYOUT};
 use crate::llvm::compare::{build_eq, build_neq};
 use crate::llvm::convert::{
    basic_type_from_layout, block_of_memory, collection, get_fn_type, get_ptr_type, ptr_int,
@ -2556,6 +2556,12 @@ fn run_low_level<'a, 'ctx, 'env>(
            );
            BasicValueEnum::IntValue(is_zero)
        }
+        StrCountGraphemes => {
+            // Str.countGraphemes : Str -> Int
+            debug_assert_eq!(args.len(), 1);
+
+            str_count_graphemes(env, scope, parent, args[0])
+        }
        ListLen => {
            // List.len : List * -> Int
            debug_assert_eq!(args.len(), 1);
@ -3056,7 +3062,7 @@ fn build_int_binop<'a, 'ctx, 'env>(
            NumPowInt,
            env,
            &[lhs.into(), rhs.into()],
-            &bitcode::MATH_POW_INT,
+            &bitcode::NUM_POW_INT,
        ),
        _ => {
            unreachable!("Unrecognized int binary operation: {:?}", op);
@ -3064,7 +3070,7 @@ fn build_int_binop<'a, 'ctx, 'env>(
    }
 }

-fn call_bitcode_fn<'a, 'ctx, 'env>(
+pub fn call_bitcode_fn<'a, 'ctx, 'env>(
    op: LowLevel,
    env: &Env<'a, 'ctx, 'env>,
    args: &[BasicValueEnum<'ctx>],
@ -3105,7 +3111,7 @@ fn build_float_binop<'a, 'ctx, 'env>(
            let result = bd.build_float_add(lhs, rhs, "add_float");

            let is_finite =
-                call_bitcode_fn(NumIsFinite, env, &[result.into()], &bitcode::MATH_IS_FINITE)
+                call_bitcode_fn(NumIsFinite, env, &[result.into()], &bitcode::NUM_IS_FINITE)
                    .into_int_value();

            let then_block = context.append_basic_block(parent, "then_block");
@ -3127,7 +3133,7 @@ fn build_float_binop<'a, 'ctx, 'env>(
            let result = bd.build_float_add(lhs, rhs, "add_float");

            let is_finite =
-                call_bitcode_fn(NumIsFinite, env, &[result.into()], &bitcode::MATH_IS_FINITE)
+                call_bitcode_fn(NumIsFinite, env, &[result.into()], &bitcode::NUM_IS_FINITE)
                    .into_int_value();
            let is_infinite = bd.build_not(is_finite, "negate");

@ -3257,10 +3263,10 @@ fn build_float_unary_op<'a, 'ctx, 'env>(
            env.context.i64_type(),
            "num_floor",
        ),
-        NumIsFinite => call_bitcode_fn(NumIsFinite, env, &[arg.into()], &bitcode::MATH_IS_FINITE),
-        NumAtan => call_bitcode_fn(NumAtan, env, &[arg.into()], &bitcode::MATH_ATAN),
-        NumAcos => call_bitcode_fn(NumAcos, env, &[arg.into()], &bitcode::MATH_ACOS),
-        NumAsin => call_bitcode_fn(NumAsin, env, &[arg.into()], &bitcode::MATH_ASIN),
+        NumIsFinite => call_bitcode_fn(NumIsFinite, env, &[arg.into()], &bitcode::NUM_IS_FINITE),
+        NumAtan => call_bitcode_fn(NumAtan, env, &[arg.into()], &bitcode::NUM_ATAN),
+        NumAcos => call_bitcode_fn(NumAcos, env, &[arg.into()], &bitcode::NUM_ACOS),
+        NumAsin => call_bitcode_fn(NumAsin, env, &[arg.into()], &bitcode::NUM_ASIN),
        _ => {
            unreachable!("Unrecognized int unary operation: {:?}", op);
        }
--- a/compiler/gen/src/llvm/build_str.rs
+++ b/compiler/gen/src/llvm/build_str.rs
@ -1,4 +1,4 @@
-use crate::llvm::build::{ptr_from_symbol, Env, InPlace, Scope};
+use crate::llvm::build::{call_bitcode_fn, ptr_from_symbol, Env, InPlace, Scope};
 use crate::llvm::build_list::{
    allocate_list, build_basic_phi2, empty_list, incrementing_elem_loop, load_list_ptr, store_list,
 };
@ -7,6 +7,8 @@ use inkwell::builder::Builder;
 use inkwell::types::BasicTypeEnum;
 use inkwell::values::{BasicValueEnum, FunctionValue, IntValue, PointerValue, StructValue};
 use inkwell::{AddressSpace, IntPredicate};
+use roc_builtins::bitcode;
+use roc_module::low_level::LowLevel;
 use roc_module::symbol::Symbol;
 use roc_mono::layout::{Builtin, Layout};

@ -27,19 +29,19 @@ pub fn str_concat<'a, 'ctx, 'env>(
    let second_str_ptr = ptr_from_symbol(scope, second_str_symbol);
    let first_str_ptr = ptr_from_symbol(scope, first_str_symbol);

-    let str_wrapper_type = BasicTypeEnum::StructType(collection(ctx, env.ptr_bytes));
+    let ret_type = BasicTypeEnum::StructType(collection(ctx, env.ptr_bytes));

    load_str(
        env,
        parent,
        *second_str_ptr,
-        str_wrapper_type,
+        ret_type,
        |second_str_ptr, second_str_len, second_str_smallness| {
            load_str(
                env,
                parent,
                *first_str_ptr,
-                str_wrapper_type,
+                ret_type,
                |first_str_ptr, first_str_len, first_str_smallness| {
                    // first_str_len > 0
                    // We do this check to avoid allocating memory. If the first input
@ -72,7 +74,7 @@ pub fn str_concat<'a, 'ctx, 'env>(
                            second_str_length_comparison,
                            if_second_str_is_nonempty,
                            if_second_str_is_empty,
-                            str_wrapper_type,
+                            ret_type,
                        )
                    };

@ -591,3 +593,34 @@ fn str_is_not_empty<'ctx>(env: &Env<'_, 'ctx, '_>, len: IntValue<'ctx>) -> IntVa
        "str_len_is_nonzero",
    )
 }
+
+/// Str.countGraphemes : Str -> Int
+pub fn str_count_graphemes<'a, 'ctx, 'env>(
+    env: &Env<'a, 'ctx, 'env>,
+    scope: &Scope<'a, 'ctx>,
+    parent: FunctionValue<'ctx>,
+    str_symbol: Symbol,
+) -> BasicValueEnum<'ctx> {
+    let ctx = env.context;
+
+    let sym_str_ptr = ptr_from_symbol(scope, str_symbol);
+    let ret_type = BasicTypeEnum::IntType(ctx.i64_type());
+
+    load_str(
+        env,
+        parent,
+        *sym_str_ptr,
+        ret_type,
+        |str_ptr, str_len, _str_smallness| {
+            call_bitcode_fn(
+                LowLevel::StrCountGraphemes,
+                env,
+                &[
+                    BasicValueEnum::PointerValue(str_ptr),
+                    BasicValueEnum::IntValue(str_len),
+                ],
+                &bitcode::STR_COUNT_GRAPEHEME_CLUSTERS,
+            )
+        },
+    )
+}
--- a/compiler/gen/tests/gen_str.rs
+++ b/compiler/gen/tests/gen_str.rs
@ -202,4 +202,18 @@ mod gen_str {
    fn empty_str_is_empty() {
        assert_evals_to!(r#"Str.isEmpty """#, true, bool);
    }
+
+    #[test]
+    fn str_count_graphemes_small_str() {
+        assert_evals_to!(r#"Str.countGraphemes "å🤔""#, 2, usize);
+    }
+
+    #[test]
+    fn str_count_graphemes_big_str() {
+        assert_evals_to!(
+            r#"Str.countGraphemes "6🤔å🤔e¥🤔çppkd🙃1jdal🦯asdfa∆ltråø˚waia8918.,🏅jjc""#,
+            45,
+            usize
+        );
+    }
 }
--- a/compiler/module/src/low_level.rs
+++ b/compiler/module/src/low_level.rs
@ -5,6 +5,7 @@
 pub enum LowLevel {
    StrConcat,
    StrIsEmpty,
+    StrCountGraphemes,
    ListLen,
    ListGetUnsafe,
    ListSet,
--- a/compiler/module/src/symbol.rs
+++ b/compiler/module/src/symbol.rs
@ -670,6 +670,7 @@ define_builtins! {
        2 STR_IS_EMPTY: "isEmpty"
        3 STR_APPEND: "append"
        4 STR_CONCAT: "concat"
+        5 STR_COUNT_GRAPHEMES: "countGraphemes"
    }
    4 LIST: "List" => {
        0 LIST_LIST: "List" imported // the List.List type alias
--- a/compiler/mono/src/borrow.rs
+++ b/compiler/mono/src/borrow.rs
@ -520,7 +520,7 @@ pub fn lowlevel_borrow_signature(arena: &Bump, op: LowLevel) -> &[bool] {
    // - arguments that we may want to update destructively must be Owned
    // - other refcounted arguments are Borrowed
    match op {
-        ListLen | StrIsEmpty => arena.alloc_slice_copy(&[borrowed]),
+        ListLen | StrIsEmpty | StrCountGraphemes => arena.alloc_slice_copy(&[borrowed]),
        ListSet => arena.alloc_slice_copy(&[owned, irrelevant, irrelevant]),
        ListSetInPlace => arena.alloc_slice_copy(&[owned, irrelevant, irrelevant]),
        ListGetUnsafe => arena.alloc_slice_copy(&[borrowed, irrelevant]),
--- a/shell.nix
+++ b/shell.nix
@ -36,6 +36,8 @@ let
      # build libraries
      pkgs.rustc
      pkgs.cargo
+      pkgs.clippy
+      pkgs.rustfmt
      pkgs.cmake
      pkgs.git
      pkgs.python3