lhecker
a month ago
Hey all! I made this! I really hope you like it and if you don't, please open an issue: https://github.com/microsoft/edit

To respond to some of the questions or those parts I personally find interesting:

The custom TUI library is so that I can write a plugin model around a C ABI. Existing TUI frameworks that I found and were popular usually didn't map well to plain C. Others were just too large. The arena allocator exists primarily because building trees in Rust is quite annoying otherwise. It doesn't use bumpalo, because I took quite the liking to "scratch arenas" (https://nullprogram.com/blog/2023/09/27/) and it's really not that difficult to write such an allocator.
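
To illustrate why an arena helps with trees in Rust, here is a minimal sketch. It is not the editor's allocator: it swaps the bump/scratch arena for a simpler index-based `Vec` arena, and `TreeArena`, `Node`, and `NodeId` are hypothetical names. Nodes refer to each other by index, so no node owns another and the whole tree is freed in one step.

```rust
/// Index of a node inside the arena; a plain integer instead of a pointer.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct NodeId(usize);

struct Node {
    label: &'static str,
    first_child: Option<NodeId>,
    next_sibling: Option<NodeId>,
}

/// Hypothetical arena: all nodes live in one Vec and are freed together.
struct TreeArena {
    nodes: Vec<Node>,
}

impl TreeArena {
    fn new() -> Self {
        Self { nodes: Vec::new() }
    }

    /// Allocates a detached node and returns its index.
    fn alloc(&mut self, label: &'static str) -> NodeId {
        self.nodes.push(Node { label, first_child: None, next_sibling: None });
        NodeId(self.nodes.len() - 1)
    }

    /// Prepends `child` to `parent`'s child list in O(1).
    fn add_child(&mut self, parent: NodeId, child: NodeId) {
        let previous_first = self.nodes[parent.0].first_child;
        self.nodes[child.0].next_sibling = previous_first;
        self.nodes[parent.0].first_child = Some(child);
    }

    /// Collects the labels of `parent`'s children, most recently added first.
    fn children(&self, parent: NodeId) -> Vec<&'static str> {
        let mut out = Vec::new();
        let mut cur = self.nodes[parent.0].first_child;
        while let Some(id) = cur {
            out.push(self.nodes[id.0].label);
            cur = self.nodes[id.0].next_sibling;
        }
        out
    }

    /// Frees every node at once but keeps the capacity for reuse.
    fn reset(&mut self) {
        self.nodes.clear();
    }
}
```

The trade-off is that a `NodeId` is only meaningful for the arena that produced it, and only until the next `reset`.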

Regarding the choice of Rust, I actually wrote the prototype in C, C++, Zig, and Rust! Out of these 4 I personally liked Zig the most, followed by C, Rust, and C++ in that order. Since Zig is not internally supported at Microsoft just yet (chain of trust, etc.), I continued writing it in C, but after a while I became quite annoyed by the lack of features that I came to like about Zig. So, I ported it to Rust over a few days, as it is internally supported and really not all that bad either. The reason I didn't like Rust so much is because of the rather weak allocator support and how difficult building trees was. I also found the lack of cursors for linked lists in stable Rust rather irritating if I'm honest. But I would say that I enjoyed it overall.

We decided against nano, kilo, micro, yori, and others for various reasons. What we wanted was a small binary so we can ship it with all variants of Windows without extra justifications for the added binary size. It also needed to have decent Unicode support. It should've also been one built around VT output as opposed to Console APIs to allow for seamless integration with SSH. Lastly, first class support for Windows was obviously also quite important. I think out of the listed editors, micro was probably the one we wanted to use the most, but... it's just too large. I proposed building our own editor and while it took me roughly twice as long as I had planned, it was still only about 4 months (and a bit for prototyping last year).

As GuinansEyebrows put it, it's definitely quite a bit of "NIH" in the project, but I also spent all of my weekends on it and I think all of Christmas, simply because I had fun working on it. So, why not have fun learning something new, writing most things myself? I definitely learned tons working on this, which I can now use in other projects as well.

If you have any questions, let me know!

avesturaa month ago
It was very interesting to me that you liked Zig the most. Thank you for making this!
90s_deva month ago
1. What do you like about Zig more than Rust?

2. How did you ensure your Zig/C memory was freed properly?

3. What do you not like about Rust?

lhecker90s_deva month ago
> What do you like about Zig more than Rust?

It's been quite a while now, but:

- Great allocator support

- Comptime is better than macros

- Better interop with C

- In the context of the editor, raw byte slices work way better than validated strings (i.e. `str` in Rust) even for things I know are valid UTF8

- Constructing structs with .{} is neat

- Try/catch is kind of neat (try blocks in Rust will make this roughly equivalent I think, but that's unstable so it doesn't count)

- Despite being less complete, somehow the utility functions in Zig just "clicked" better with me - it somehow just felt nice reading the code

There's probably more. But overall, Zig feels like a good fit for writing low-level code, which is something I personally simply enjoy. Rust sometimes feels like the opposite, particularly due to the lack of allocators in most of its types. And because of the many barriers in place to write performant code safely. Example: The `Read` trait doesn't work on `MaybeUninit<u8>` yet and some people online suggest to just zero-init the read buffer because the cost is lower than the syscall. Well, they aren't entirely wrong, yet this isn't an attitude I often encounter in the Zig area.
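
As a rough sketch of the `Read` situation on stable Rust (function names here are hypothetical): the first function is the commonly suggested "just zero the buffer" approach, the second lets the standard library fill a `Vec`'s spare capacity so the caller never writes zeros itself. The two are not equivalent, since `read_to_end` may issue several reads until the limit or end of input is reached.

```rust
use std::io::{self, Read};

/// Zero-initializes the buffer first, then performs a single `read`.
fn read_chunk_zeroed<R: Read>(reader: &mut R, cap: usize) -> io::Result<Vec<u8>> {
    let mut buf = vec![0u8; cap]; // pays for writing `cap` zeros up front
    let n = reader.read(&mut buf)?;
    buf.truncate(n);
    Ok(buf)
}

/// Reads at most `cap` bytes into the Vec's spare capacity.
/// No explicit zeroing in user code; may call `read` more than once.
fn read_chunk_spare<R: Read>(reader: &mut R, cap: usize) -> io::Result<Vec<u8>> {
    let mut buf = Vec::with_capacity(cap);
    reader.by_ref().take(cap as u64).read_to_end(&mut buf)?;
    Ok(buf)
}
```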

> How did you ensure your Zig/C memory was freed properly?

Most allocations happened either in the text buffer (= one huge linear allocator) or in arenas (also linear allocators) so freeing was a matter of resetting the allocator in a few strategic places (i.e. once per render frame). This is actually very similar to the current Rust code which performs no heap allocations in a steady state either. Even though my Zig/C code had bugs, I don't remember having memory issues in particular.
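
The "reset once per frame" pattern can be sketched roughly as follows. This is a deliberately simplified, safe stand-in and not the editor's code: `Scratch` and `render_frames` are hypothetical names, and because `alloc` takes `&mut self`, only one allocation can be in use at a time (a real arena would use interior mutability or `unsafe` to lift that restriction).

```rust
/// Hypothetical linear ("bump") allocator over one fixed byte buffer.
struct Scratch {
    buf: Vec<u8>,
    used: usize,
}

impl Scratch {
    fn with_capacity(cap: usize) -> Self {
        Self { buf: vec![0; cap], used: 0 }
    }

    /// Hands out the next `len` bytes, or `None` if the arena is full.
    fn alloc(&mut self, len: usize) -> Option<&mut [u8]> {
        let end = self.used.checked_add(len)?;
        if end > self.buf.len() {
            return None;
        }
        let start = self.used;
        self.used = end;
        Some(&mut self.buf[start..end])
    }

    /// "Frees" everything by rewinding the bump offset; no memory is returned.
    fn reset(&mut self) {
        self.used = 0;
    }
}

/// Simulates a render loop and returns the peak number of bytes in use.
fn render_frames(scratch: &mut Scratch, frames: usize) -> usize {
    let mut peak = 0;
    for frame in 0..frames {
        scratch.reset(); // one reset per frame replaces all individual frees
        let line = scratch.alloc(32).expect("arena too small");
        line.fill(b' ');
        line[0] = b'0' + (frame % 10) as u8;
        let _status = scratch.alloc(16).expect("arena too small");
        peak = peak.max(scratch.used);
    }
    peak
}
```

Because every frame starts from offset zero, memory use stays flat no matter how many frames are rendered.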

> What do you not like about Rust?

I don't yet understand the value of forbidding multiple mutable aliases, particularly at a compiler level. My understanding was that the difference is only a few percent in benchmarks. Is that correct? There are huge risks you run into when writing unsafe Rust: If you accidentally create aliasing mutable pointers, you can break your code quite badly. I thought the language's goal is to be safe. Is the assumption that no one should need to write unsafe code outside of the stdlib and a few others? I understand if that's the case, but then the language isn't a perfect fit for me, because I like writing performant code and that often requires writing unsafe code, yet I don't want to write actual literal unsafe code. If what I said is correct, I think I'd personally rather have an unsafe attribute to mark certain references as `noalias` explicitly.

Another thing is the difficulty of using uninitialized data in Rust. I do understand that this involves an attribute in clang which can then perform quite drastic optimizations based on it, but this makes my life as a programmer kind of difficult at times. When it comes to `MaybeUninit`, or the previous `mem::uninitialized()`, I feel like the complexity of compiler engineering is leaking into the programming language itself and I'd like to be shielded from that if possible. At the end of the day, what I'd love to do is declare an array in Rust, assign it no value, `read()` into it, and magically reading from said array is safe. That's roughly how it works in C, and I know that it's also UB there if you do it wrong, but one thing is different: It doesn't really ever occupy my mind as a problem. In Rust it does.
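
For reference, this is roughly what the ceremony looks like on stable Rust today. It is a small sketch with a hypothetical function name, `copy_prefix`: an uninitialized stack array is filled element by element, and an `unsafe` block is needed to view the written prefix as plain bytes.

```rust
use std::mem::MaybeUninit;

/// Copies up to 16 bytes of `src` through an uninitialized stack buffer.
fn copy_prefix(src: &[u8]) -> Vec<u8> {
    // Declaring the array without a value: every slot is `MaybeUninit<u8>`.
    let mut buf = [MaybeUninit::<u8>::uninit(); 16];
    let n = src.len().min(buf.len());

    // `zip` stops at the shorter side, so exactly `n` slots get written.
    for (slot, &byte) in buf.iter_mut().zip(src) {
        slot.write(byte);
    }

    // SAFETY: the first `n` slots were initialized by the loop above, and
    // `MaybeUninit<u8>` has the same layout as `u8`.
    let init: &[u8] = unsafe { std::slice::from_raw_parts(buf.as_ptr().cast::<u8>(), n) };
    init.to_vec()
}
```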

Also, as I mentioned, `split_off` and `remove` from `LinkedList` use numeric indices and are O(n), right? `linked_list_cursors` is still marked as unstable. That's kind of irritating if I'm honest, even if it's kind of silly to complain about this in particular.
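
Without cursors, removing an element from the middle of a stable `LinkedList` can be sketched like this (`remove_at` is a hypothetical helper, not a std API): the list has to be walked from one end by index, even if the caller already "knows" where the node is.

```rust
use std::collections::LinkedList;

/// Removes the element at `index` using only stable `LinkedList` APIs.
fn remove_at<T>(list: &mut LinkedList<T>, index: usize) -> Option<T> {
    if index >= list.len() {
        return None;
    }
    let mut tail = list.split_off(index); // O(n): walks to `index`
    let removed = tail.pop_front(); // O(1)
    list.append(&mut tail); // O(1): relinks the remaining nodes
    removed
}
```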

In all fairness, what bothers me the most when it comes to Zig is that the language itself often feels like it's being obtuse for no reason. Loops for instance read vastly different to most other modern languages and it's unclear to me why that's useful. Files-as-structs is also quite confusing. I'm not a big fan of this "quirkiness" and I'd rather use a language that's more similar to the average.

At the end of the day, both Zig and Rust do a fine job in their own right.

ameliaquining lheckera month ago
The design intent of unsafe Rust is that its usage should be rare and well-encapsulated, but supported in any domain. Alleviating a performance bottleneck is a fine reason to use unsafe, as long as it only appears at the site of the bottleneck and doesn't unnecessarily leak into the rest of the codebase.

The most basic reason why you can't have unrestricted mutable aliasing is because then the following code, which contains a use-after-free bug, would be legal:

    let mut val = Some("Hello".to_owned());
    let outer_mut = &mut val;
    let inner_mut = val.as_mut().unwrap();
    *outer_mut = None;
    println!("{}", inner_mut);

If, as is sometimes the case, you need some kind of mutable aliasing in your program, the intended solution is to use an interior-mutability API (which under the hood causes LLVM's noalias attribute to be omitted). Which one to use depends on the precise details of your use case; some (e.g., RefCell) carry performance costs, while others (e.g., Cell) are zero-cost but work only for certain types or access patterns. Having to figure this out is annoying, but such is the price of memory safety without runtime garbage collection. In the worst-case scenario you can use UnsafeCell, which as the name suggests is unsafe, but works with any type with no performance cost. UnsafeCell is also a little bit heavy on boilerplate/syntactic salt, which people used to C sometimes find annoying; there isn't that much drive to fix this because, as per above, it's supposed to be rarely used.
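
A small sketch of the two safe options mentioned above, using hypothetical helper names: `Cell` allows mutation through aliased shared references for `Copy` values, while `RefCell` moves the exclusivity check to run time.

```rust
use std::cell::{Cell, RefCell};

/// `a` and `b` may point at the same `Cell`; that is allowed and well-defined.
fn bump_both(a: &Cell<u32>, b: &Cell<u32>) -> u32 {
    a.set(a.get() + 1);
    b.set(b.get() + 1);
    a.get()
}

/// Returns true if the value could be cleared, false if it is currently borrowed.
fn try_clear(slot: &RefCell<Option<String>>) -> bool {
    match slot.try_borrow_mut() {
        Ok(mut guard) => {
            *guard = None;
            true
        }
        // Someone still holds a borrow; RefCell refuses instead of aliasing.
        Err(_) => false,
    }
}
```

With `RefCell`, the situation from the snippet above (clearing the outer value while an inner reference is live) turns into a recoverable run-time error instead of a use-after-free.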

The "few percent in benchmarks" thing sounds like it's referring to the rule that it's UB to use unsafe code to make aliased &mut references even if you don't actually use those references in a problematic way. Lifting that rule would preclude certain compiler optimizations, and as per above would not fix the real problem; you still couldn't have unrestricted mutable aliasing. It would only alleviate the verbosity cost, and that could be done in a different way without the performance cost (like by adding special concise syntax for UnsafeCell) if it were deemed important enough.

The uninitialized-memory situation is pretty widely agreed to be unsatisfactory. Unfortunately it is hard to fix. Ideally the compiler would do control-flow analysis so that you can read from memory that was uninitialized only if it has definitely been written to since then. Unfortunately this would be a big complicated difficult-to-implement type system feature, and the need to make it unwind-safe (analogous to the concept of exception safety in C++) adds much, much more complication and difficulty on top of that. You could imagine an intermediate solution, wherein reading from uninitialized memory gets you a valid-but-unspecified value of the applicable type instead of UB, but that also has some difficulties, such as unsoundness in conjunction with MADV_FREE; see https://internals.rust-lang.org/t/freeze-maybeuninit-t-maybeuninit-t-for-masked-reads/17188 if you're curious for more details.

Again, the point here is not "the current design is optimal", it's "improving on the current design is a difficult engineering problem that no one has solved yet".

I think people who need cursors over linked lists use a third-party library from crates.io for this, but it's quite reasonable to think that the standard library should have this. Most of the time when a smallish feature like that remains unstable it's because nobody has cared enough about it to shepherd it through the stabilization process (perhaps because it's not a hard blocker if you can use the third-party library instead). Possibly that process is too slow and heavyweight, but of course enacting a big process change in a massively multiplayer engineering project that's governed by consensus is an even harder problem.

Cloudef lheckera month ago
I think files-as-structs makes a lot of sense, as the language then doesn't have to treat files in any special way.
throwawaymathsCloudefa month ago
i had been dreaming about files as structs for about two decades and Zig made it happen.

i will say the struct files (files which are not a namespace struct.. that is, they have field values) are a bit weird, but at least the syntax is consistent with an implicit surrounding bracket.

steveklabnika month ago
I’d love to hear about the use of nightly features. I haven’t had time to dig into the usage, but that was something I was surprised by!
lheckersteveklabnika month ago
Up until around 2 months ago the project actually built with stable Rust. But as I had to get the project ready for release it became a recurring annoyance to write shims for things I needed (e.g. `maybe_uninit_fill` to conveniently fill the return value of my arena allocator). My breaking point was the aforementioned `LinkedList` API and its lack of cursors in stable Rust. I know it's silly, but this, combined with the time pressure, and combined with the lack of `allocator_api` in stable, just kind of broke me. I deleted all my shims the same day (or sometime around it at least), switched to nightly Rust and called it a day.

It definitely helped me with my development speed, because I had a much larger breadth of APIs available to me all at once. Now that the project is released, I'll probably stay with the nightly version for another few months until after `let_chains` is out in stable, because I genuinely love that quality-of-life feature so much and just don't want to live without it anymore. Afterward, I'll make sure it builds in stable Rust. There's not really any genuine reason it needs nightly, except for... time.

Apropos custom helpers, I think it may be worth optimizing `Vec::splice`. I wrote myself a custom splice function to reduce the binary size: https://github.com/microsoft/edit/blob/e8d40f6e7a95a6e19765ff19621cf0d39708f7b0/src/helpers.rs#L199-L253
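
To give an idea of what a specialized splice can look like, here is a sketch for byte vectors. It is not the function from the linked `helpers.rs`; `replace_range` is a hypothetical name, and the approach (resize, one `copy_within`, one `copy_from_slice`) is just one straightforward way to do it for `Copy` data.

```rust
use std::ops::Range;

/// Replaces `v[range]` with `src`, shifting the tail with a single memmove.
/// Behaves like `v.splice(range, src.iter().copied())` for byte vectors.
fn replace_range(v: &mut Vec<u8>, range: Range<usize>, src: &[u8]) {
    assert!(range.start <= range.end && range.end <= v.len());
    let old_len = v.len();
    let removed = range.end - range.start;
    let new_len = old_len - removed + src.len();
    let tail_dst = range.start + src.len();

    if src.len() > removed {
        // Growing: make room first, then move the tail to the right.
        v.resize(new_len, 0);
        v.copy_within(range.end..old_len, tail_dst);
    } else {
        // Shrinking (or same size): move the tail left, then cut off the rest.
        v.copy_within(range.end..old_len, tail_dst);
        v.truncate(new_len);
    }
    v[range.start..tail_dst].copy_from_slice(src);
}
```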

The differences can be quite significant: https://godbolt.org/z/GeoEnf5M7

steveklabnik lheckera month ago
Thank you!

> I know it's silly,

Nah, what's silly is LinkedList.

> I think it may be worth optimizing `Vec::splice`.

If this is upstream-able, you should try! Generally upstream is interested in optimizations. sort and float parsing are two things I remember changing significantly over the years. I didn't check to see what the differences are and how easy that actually would be...

karunamurtia month ago
Why not a WebAssembly ABI?
lheckerkarunamurtia month ago
I'm not familiar with that, so I can't say. If you have any links on that topic, I'd appreciate it.

Generally speaking, the requirement on my end is that whatever we use is as minimal as it gets: Minimal binary size overhead and minimal performance overhead. It also needs to be cross-platform of course. This for instance precludes the widely used WinRT ABI that's being used nowadays on Windows.
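
For readers unfamiliar with what a plain C ABI plugin boundary looks like from Rust, here is a purely illustrative sketch. Nothing in it reflects the editor's actual (not yet existing) plugin API: `PluginVTable`, its fields, and `DEMO_PLUGIN` are all made up for the example.

```rust
use std::ffi::{c_char, CStr};

/// Hypothetical plugin interface: a `#[repr(C)]` table of `extern "C"`
/// function pointers, which any language with C interop can produce.
#[repr(C)]
pub struct PluginVTable {
    pub abi_version: u32,
    pub name: extern "C" fn() -> *const c_char,
    pub on_key: extern "C" fn(key: u32) -> i32,
}

extern "C" fn demo_name() -> *const c_char {
    // NUL-terminated static string, as C callers expect.
    b"demo\0".as_ptr().cast()
}

extern "C" fn demo_on_key(key: u32) -> i32 {
    // Pretend the plugin handles only the Escape key (27).
    (key == 27) as i32
}

/// A plugin "instance" as the host would receive it.
pub static DEMO_PLUGIN: PluginVTable = PluginVTable {
    abi_version: 1,
    name: demo_name,
    on_key: demo_on_key,
};

/// Host side: calls through the table and copies the name into a `String`.
fn plugin_name(plugin: &PluginVTable) -> String {
    // SAFETY: in this sketch `name` always returns a valid, NUL-terminated,
    // static string. A real host would have to document that contract.
    unsafe { CStr::from_ptr((plugin.name)()) }
        .to_string_lossy()
        .into_owned()
}
```

The overhead of such a boundary is one indirect call per entry point, which is in line with the "as minimal as it gets" requirement.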

karunamurti lheckera month ago
Maybe something like https://www.codecentric.de/en/knowledge-hub/blog/plug-in-architectures-webassembly ?

WebAssembly is the binary spec for the web. But now everyone is using it because it's portable and lightweight.

The idea is that you can create a plugin using any language that compiles to WebAssembly: C, Rust, Pascal, Go, C++. Compile once and it should work on Windows, Linux, and Mac. No need to compile for multiple architectures.

Performance should be great, near native, but I guess there's going to be a problem with the added WebAssembly runtime size. Here is a runtime with estimated sizes: https://github.com/bytecodealliance/wasm-micro-runtime

And it's sandboxed too, so should be secure.

karunamurtikarunamurtia month ago
And I guess forcing FFI for plugins is going to be a headache for many plugin authors.
lheckerkarunamurtia month ago
What I imagined is that people could load runtimes like node.js as a plugin itself in order to then load plugins written in non-native languages (JavaScript, Lua, etc.; node.js being an extreme example). I wonder if WASM could also be one such "adapter plugin"?

But plugins are still a long way off. Until then, I updated the issue to include your suggestion (thank you for it!): https://github.com/microsoft/edit/issues/17

karunamurti lheckera month ago
Yeah, the choice is usually JS or Lua. WebAssembly is just a new option.
qingcharlesa month ago
Why Rust over a compiled .NET lang? (e.g. C#)
lheckerqingcharlesa month ago
Pretty much exclusively binary size. Even with AOT, C# is still too large. Otherwise, I wouldn't have minded using it. I believe SIMD is a requirement for writing a performant editor, but outside of that, it really doesn't need to be a language like C or Rust.
int_19h lheckera month ago
Is there any context in which .NET runtime wouldn't be available on Windows (even if an older version, e.g. 4.x)? Because when you can rely on that and thus you don't need to do AOT, the .exe size for C# would likely be an order of magnitude smaller.
lheckerint_19ha month ago
I intended for this editor to be cross-platform and didn't want to take on a large runtime dependency that small Docker images or similar may not be willing to bundle.
anacrolixa month ago
The quirkiness of Zig is real. I'd love for Zig to win out but it's just too weird, and it's not progressing in a consistent direction. I can appreciate you falling back to Rust.
dnauticsanacrolixa month ago
> it's not progressing in a consistent direction

I've maintained a project in zig since either 0.4 or 0.5 and i don't think this is the case at all. supporting 0.12 -> 0.13 was no lines of code, iirc, and 0.13 -> 0.14 was just making sure my zig parser could handle the new case feature (that lets you write a duff's device).

zig may seem quirky but it's highly internally consistent, and not far off from C. every difference with c was made for good reasons (e.g. `var x:u8` vs `char x` gives you context free parsing)

i would say my gripes are:

1. losing async

2. not making functions const declarations like javascript (but i get why they want the sugar)

pabs3a month ago
Can you say more about the chain of trust issue? Does Rust also not have that problem? Or are you using mrustc to bootstrap rustc?
lheckerpabs3a month ago
Indeed, we have our own bootstrapped Rust toolchain internally. I think this has to do with (legal) certifications, but I'm not entirely sure about that.
pabs3 lheckera month ago
BTW, are you aware of the Bootstrappable Builds folks' achievements? Starting with only an MBR worth of commented machine code, plus a ton of source code, they build up to a Linux distro.

https://bootstrappable.org/ https://lwn.net/Articles/983340/ https://github.com/fosslinux/live-bootstrap/blob/master/parts.rst https://stagex.tools/

eviksa month ago
> So, why not have fun learning something new, writing most things myself? I definitely learned tons working on this, which I can now use in other projects as well.

Because presumably you should have been doing it mostly for the benefit of Windows users, and wasting time because it’s a fun personal learning exercise means those users would suffer getting an underpowered app

fuzzfactoreviksa month ago
At the same time I think some of the most brilliant things to come from Microsoft are products of individual initiative, and when the project ends up compromised for some reason I get the idea that it's some kind of institutional higher-ups that do the damage after the fact.

Maybe just some residual instinct left over from times past when more people like Ballmer were still prominent, and they were not as user-enabling as today in some ways?

dmos62eviksa month ago
Wow, what a disheartening comment. Did GP not explain why they did what they did? This is an open-source project, the author expressed joy in working on it, and you have the heart to tell him off. This is far below what I expect of HN.
eviksdmos62a month ago
Did the comment not explain what the issue with that explanation is?

But maybe if you didn't misrepresent the situation so much you wouldn't lose your heart. This is not some tiny personal open source project where fun can be the only valid reason, but "will ship as part of Windows 11!", so millions of devices in a professional OS. Are your expectations so poorly calibrated that you have none in both cases? Why are they higher re. a forum comment?

dmos62eviksa month ago
What led you to say that the author did not have users' interests at heart? What led you to imply that there's something wrong with reimplementing something or having fun or whatever it is you disliked so much? What leads you to think that a person working on something delivered with Windows 11 deserves less respect than a person working on a less used system? Or, do you consider what you said neutral, well argued criticism?
eviksdmos62a month ago
What led you to continue to misrepresent... everything?

Why did you make up a point about a person deserving respect and pass it as my thought? Could you not come up with a more coherent difference between those two situations yourself?

Why are you asking a question about the motivation if you don't even understand "whatever" it is I disliked?

Why did you make up the implication that rejects having fun?

Why are you making it personal in the first place?

How can criticism be neutral when it's... critical?

What kind of well argued thing do you expect in a... single sentence to even ask such a question?

Again, why is there such a huge mismatch in your expectations re. a comment and a professional app?

dmos62eviksa month ago
I didn't intentionally misrepresent your comment, but I am open to having misunderstood it. Also, me answering with questions didn't help.

> > So, why not have fun learning something new, writing most things myself? I definitely learned tons working on this, which I can now use in other projects as well.

> > Because presumably you should have been doing it mostly for the benefit of Windows users, and wasting time because it’s a fun personal learning exercise means those users would suffer getting an underpowered app

Would you care to elaborate on what intention you understood the author to have, which aspect of author's work you deemed as a waste of time, and why do you think the resulting app is underpowered for Windows 11 users?

I would also be interested in what mismatch you saw in my expectations as regards your original comment and a professional app.

I realise that we got off on a bad foot, but, if you care to, we can try and restart the conversation.

avesturaeviksa month ago
Fun-shaming a passionate developer who, beyond their job description, delivered an editor that checks all the required boxes (small binary, fast, ssh-support, etc.) in just 4 months, while working on weekends and even Christmas, and calling it "wasting time" is incredibly upsetting. I'm grateful to work with people who value that kind of initiative.
eviksavesturaa month ago
You're making it all up, he's a terminal product manager, so it is not beyond job description.

Of course the app doesn't check all the boxes: plenty of features that other editors have had years to add simply couldn't be added, even if you work on Christmas (which, by the way, is also entirely driven by the NIH decision to write from scratch and, beyond that, to do a 3-language mock)

And I'm not shaming the fun, I'm saying that that's not a good justification for shipping a worse app to the millions in a professional setting

MonkeyCluba month ago
I'm really glad this wasn't killed off with the recent Microsoft layoffs!
mwcampbella month ago
I wonder if you used GitHub Copilot or some other LLM-based code generation tool to write any of the code. If not, that's a lot of code to write from scratch while presumably under pressure to ship, and I'm impressed.
lheckermwcampbella month ago
I did use Copilot a lot, just not its edit/agent modes. I found that they perform quite poorly on this type of project. What I use primarily is its autocompletion - it genuinely cured my RSI - and sometimes the chat to ask a couple questions.

What you expressed is a sentiment I've seen in quite a few places now. I think people would be shocked to learn how much time I spent on just the editing model (= cursor movement and similar behavior that's unique to this editor = a small part of the app) VS everything else. It's really not all that difficult to write a few FFI abstractions, or a UI framework, compared to that. "Pressure to ship" is definitely true, but it's not like Microsoft is holding a gun to my chest, telling me to complete the editor in 2 months flat. I also consider it important to not neglect one's own progress at becoming more experienced. How would one do that if not by challenging oneself with learning new things, right? Basically, I think I struck a balance that was... "alright".

alexrpa month ago
> Since Zig is not internally supported at Microsoft just yet (chain of trust, etc.)

Is there something about Zig in particular that makes this the case, or is it just an internal politics thing?

lheckeralexrpa month ago
I don't know why, but I'm quite certain it's neither of the two. If anything, it probably has to do with commitment: When a company as large as MS adopts a new language internally, it's like spinning up an entire startup internally, dedicated to developing and supporting just that new language, due to the scale at which things are run across so many engineers and projects.
stianhoilanda month ago
Thanks for undertaking this project, as well as making WT such an awesome app!

I already expressed my appreciation on the repo, but was promptly shushed by your colleague for Incitement Of A Language War, hehe.

I'm impressed by the architecture and implementation choices, especially the gap buffer and cursor movement. It seems we've independently arrived at the same kinds of conclusions on how to min-max a text editor: minimal concepts with maximal functionality.
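
For readers who have not met the data structure, a textbook gap buffer can be sketched as below. This is a generic illustration with hypothetical names, not the editor's implementation: the text lives in one allocation with a movable "gap" at the cursor, so typing at the cursor is cheap and only moving the cursor far away costs a memmove.

```rust
/// Textbook gap buffer: `buf[..gap_start]` and `buf[gap_end..]` hold text,
/// the bytes in between are free space at the edit position.
struct GapBuffer {
    buf: Vec<u8>,
    gap_start: usize,
    gap_end: usize,
}

impl GapBuffer {
    fn new(capacity: usize) -> Self {
        Self { buf: vec![0; capacity], gap_start: 0, gap_end: capacity }
    }

    fn len(&self) -> usize {
        self.buf.len() - (self.gap_end - self.gap_start)
    }

    /// Moves the gap so that it starts at text offset `pos`.
    fn move_gap(&mut self, pos: usize) {
        assert!(pos <= self.len());
        if pos < self.gap_start {
            let n = self.gap_start - pos;
            self.buf.copy_within(pos..self.gap_start, self.gap_end - n);
            self.gap_start = pos;
            self.gap_end -= n;
        } else if pos > self.gap_start {
            let n = pos - self.gap_start;
            self.buf.copy_within(self.gap_end..self.gap_end + n, self.gap_start);
            self.gap_start += n;
            self.gap_end += n;
        }
    }

    /// Enlarges the gap by at least `extra` bytes.
    fn grow(&mut self, extra: usize) {
        let old_len = self.buf.len();
        let add = extra.max(old_len).max(16);
        self.buf.resize(old_len + add, 0);
        self.buf.copy_within(self.gap_end..old_len, self.gap_end + add);
        self.gap_end += add;
    }

    fn insert(&mut self, pos: usize, text: &[u8]) {
        self.move_gap(pos);
        if self.gap_end - self.gap_start < text.len() {
            self.grow(text.len());
        }
        self.buf[self.gap_start..self.gap_start + text.len()].copy_from_slice(text);
        self.gap_start += text.len();
    }

    /// Deletes up to `n` bytes after `pos` by widening the gap over them.
    fn delete(&mut self, pos: usize, n: usize) {
        self.move_gap(pos);
        let n = n.min(self.buf.len() - self.gap_end);
        self.gap_end += n;
    }

    fn to_vec(&self) -> Vec<u8> {
        [&self.buf[..self.gap_start], &self.buf[self.gap_end..]].concat()
    }
}
```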

Others have asked about Zig. I would love to hear more about the work you did in C. Did you start in C? What are some reasons why you didn't continue with C? If you had continued in C, with hindsight, what would have been most annoying? What was clearly better in C? Again, with hindsight, what would have been the best parts of following through with C? I see that you are C-cultured as well (Chris Wellons' blog) and some of the upsides of Zig that you mention I would have guessed you could have elegantly solved in C using Chris' insights. I'm very curious how with such expert advice available you still sought elsewhere and preferred it. Looking forward to hearing about the C-side of the story.

Good luck with the project, and see you on the repo :)

marler8997a month ago
I checked the git history to see if you included the Zig version, but it looks like the first revision is Rust...

In the Zig version did you use my zigwin32 project or did you go with something else? Also, how did you like the Zig build system vs Rust's?

lheckermarler8997a month ago
Back then (a year ago?) I simply included the Windows.h header into a Zig file. Is that not possible anymore? It worked great back then for me IIRC!

Overall, I liked the build system. What I found annoying is that I had to manually search for the Windows SDK path in build.zig just so I can addIncludePath it. I needed that so I can add ICU as a dependency.

The only thing that bothered me apart from that was that producing LTO'd, stripped release builds while retaining debug symbols in a separate file was seemingly impossible. This was extra bad for Windows, where conventionally debug information is always kept in a separate file (a PDB). That just didn't work and it'd be great if that has been fixed since then (or gets fixed in the near term).

fastasucana month ago
Will you implement theming?
Gb35643a month ago
Thanks for this. Don't ask why, but I just defaulted to Edit on Linux. I noticed there's no locking for edited files. There's not even a notification saying "This file has been modified elsewhere since you opened it. Do you still want to save?"

Can you confirm this? Is it something you intend to add? Curious to know why, if the answer is no.