This is too biased in favor of docs IMHO*. I do agree with many points: documentation IS amazing, and you are very likely under-documenting things in your company. But documentation is not cheap to create, and it's especially not cheap to maintain. If you are not writing enough then yes, sure, that's probably a great investment, but start bit by bit.
I've worked in multiple companies where the problem was too much documentation, and of course everyone was afraid to update or *gasps* remove any piece of old documentation in case it was still useful. Imagine working on a codebase where 80% of the code was unused or commented out but no one dared change it just in case (flashback to 2010 with 4000 lines of style.css).
I'd suggest taking a more holistic approach and treating documentation a lot like testing: for that prototype, probably just write the barebones documentation; for the production-ready new feature, go all-in and write detailed documentation, tutorials, etc.
If you do want to go deeper with documentation, then you'll need a dedicated team (like a team of testers) that works exclusively on documentation. At some point it does make sense to hire only for that, and it can even be a differentiating point for your startup if done correctly.
For libraries, a ratio I've seen work pretty well is approx 1:3:5 for lines of code:tests:docs; you can do tests first, or even documentation first, but once everything is finished, if you count the number of lines, that's a decent ratio. Note that when counting "lines of docs" in an editor, a whole paragraph will count as just 1, so in reality there's a lot more docs.
I'm the CTO of Mintlify - we help other startups create their developer-facing documentation. We've been working in the documentation space for a little over a year now and I spend a lot of time thinking about documentation. I completely agree here.
Documentation is such a hard problem to "solve" if you're a fast-moving startup. You need a mixture of creating a documentation-first culture and acknowledging that documentation is difficult to maintain. Ultimately you end up creating processes to help people document their intent, decisions and the mission-critical information.
There is also a large range of types of documentation, varying from internal to external and from technical to non-technical.
We started by creating https://writer.mintlify.com/ which really resonated with developers because it made it easier to write documentation, but it only helped generate documentation that was highly technical and close to the code. We decided to stay in the documentation space but try another vector and so now we're taking a crack at public-facing documentation - which in my opinion is a different can of worms than internal documentation. However, as I'm building and growing my startup I find myself continuously playing whack-a-mole, and I definitely hope that we can build the foundation and expand to make it easier to maintain all the different types of documentation.
I'm trying your extension out and it looks promising.
A tiny bit of feedback: the shortcut key is defined in settings, not keybindings, which seems wrong? Also, the default is to override cmd+. which is an important shortcut already.
I really like the way documentation works in Rust: You basically write markdown in a special type of comment over the module, function, datatype or method you wanna document and then you can convert that into documentation automatically.
Even better: if you have examples in code blocks in these docstrings, by default they get tested as well, so if you don't update them, the tests will fail and you will notice.
In my eyes one of the biggest problems with keeping documentation up to date is that over time the mapping between the piece of code you are documenting and the place where you find it in the documentation becomes more complex, to a point where missing something is not unlikely. Rust's documentation-in-code approach addresses this problem neatly.
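To make that concrete, here is a minimal sketch of a doc comment with a tested example. `Rect`, `area` and the crate name `my_crate` are made up for illustration; only the `///` comment syntax and the behaviour of `cargo test` are Rust's own.

```rust
/// An axis-aligned rectangle measured in pixels (hypothetical example type).
pub struct Rect {
    pub width: u32,
    pub height: u32,
}

impl Rect {
    /// Returns the area of the rectangle in square pixels.
    ///
    /// The fenced example below is compiled and run by `cargo test`, so it
    /// fails if `area` changes and the example is not updated to match.
    /// (`~~~` fences are used here only so the example nests cleanly on this
    /// page; triple backticks are the more common spelling.)
    ///
    /// ~~~
    /// use my_crate::Rect; // `my_crate` is a placeholder crate name
    ///
    /// let r = Rect { width: 3, height: 4 };
    /// assert_eq!(r.area(), 12);
    /// ~~~
    pub fn area(&self) -> u32 {
        self.width * self.height
    }
}
```

Running `cargo doc` renders the comment as an HTML page, and `cargo test` on a library crate runs the embedded example as a doc-test alongside the normal tests.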
I don't have experience with this in Rust but have come to passionately hate this kind of documentation in other languages. I think pydoc, javadoc, and doxygen are all garbage. If one could apply them sensibly it would not be so much of a problem but then you have documentation nazis who force you to document every method and every parameter. This leads to highly enlightening prose documentation that the get_height method "gets the height", and that its return value is the height. A more high level problem with this is that you get documentation that is just as fragmented as the code and where the high level usage of things is not explained at all. Also it clutters the code with many highly trivial remarks.
I am a documentation nazi. I hate it when people skip over documentation because something is obvious or trivial to them. Stuff isn't obvious or trivial to people who have to use your code.
get_height gets which height, outer or inner? Are there error values, e.g. 0 as "don't know any height"? Does it have side effects? Is it a stable and reliable part of the API or bound to change soon? Is it thread safe? Will it change any of its parameters? Who deallocates the return value? Do you need to hold a lock somewhere?
Of course it might be a good idea to group together get_height, get_width, get_diagonal and get_depth if the above is all the same for those. But having no documentation just because you think it is trivial that get_height gets some height from somewhere just means that you are sloppy and didn't think of all of the above. So your code shouldn't be touched with a 10-foot pole imho.
My solution, which I personally hate but know of no alternative to, is a documentation template for each function asking the above questions (depending on runtime and language of course) that I give people to fill in. Until they learn...
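As a rough illustration of what such a filled-in template could look like, here is a sketch written as a Rust doc comment. The section headings, the `Widget` type and its behaviour are all assumptions made up for this example, not the actual template described above.

```rust
/// Hypothetical widget type, used only to illustrate the template.
pub struct Widget {
    inner_height_px: Option<u32>,
}

impl Widget {
    /// Creates a widget; pass `None` if no layout has been computed yet.
    pub fn new(inner_height_px: Option<u32>) -> Self {
        Widget { inner_height_px }
    }

    /// Returns the inner height of the widget.
    ///
    /// # Which height
    /// Inner height (content area, excluding any border), in pixels.
    ///
    /// # Unknown values
    /// Returns `None` when no layout has been computed. `Some(0)` is a real height.
    ///
    /// # Side effects
    /// None: reads a stored value and modifies nothing.
    ///
    /// # Thread safety
    /// Takes `&self` and the type has no interior mutability, so no lock is needed.
    ///
    /// # Stability
    /// Intended as a stable part of the public API.
    pub fn get_height(&self) -> Option<u32> {
        self.inner_height_px
    }
}
```

Each heading maps to one of the questions above; sections that do not apply in a given language (deallocation, for example) would simply be dropped from the template.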
- The return type makes it clear it can return an error
- The return value is typed in a way that makes it clear what the ownership is
Throw in a language like Rust that gives guarantees about thread safety and now the only thing left is whether the API is stable or not. Which I would argue doesn't matter much at all, since people will still end up depending on it regardless of the comment saying "This API might not be stable".
And the best part? My definition will never get outdated. If the assumptions change, the definition will also need to change (well, except for maybe the name)
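A sketch of the kind of definition this argument has in mind, assuming a hypothetical `Widget` with made-up `Pixels` and `HeightError` types (none of these names come from the thread):

```rust
/// A length in pixels (hypothetical newtype for illustration).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Pixels(pub u32);

/// Why a height could not be returned (hypothetical error type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum HeightError {
    /// Layout has not run yet, so no height is known.
    NotLaidOut,
}

/// Hypothetical widget type.
pub struct Widget {
    outer_height: Option<Pixels>,
}

impl Widget {
    pub fn new(outer_height: Option<Pixels>) -> Self {
        Widget { outer_height }
    }

    // `&self`: a shared borrow, so (absent interior mutability) the call
    // cannot modify the widget.
    // `Result`: failure is visible in the signature; there is no magic `0`.
    // `Pixels` by value: the caller owns the result and has nothing to free.
    pub fn outer_height(&self) -> Result<Pixels, HeightError> {
        self.outer_height.ok_or(HeightError::NotLaidOut)
    }
}
```

If the behaviour changes, say the height becomes always available, the return type has to change with it and every caller stops compiling until it is updated, which is the "never gets outdated" property claimed above.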
You are absolutely right, and one should prefer languages that can give such guarantees wherever possible.
But often one doesn't have a choice. People still write software in inferior languages such as JavaScript or Python, where you cannot even be sure about a return or parameter data type.
Quite often something is not obvious or trivial to someone who is examining a piece of code or using a library for the first time because it assumes the person already understands the context.
An example of what I mean: perhaps it is because I have a background in the sciences, but I assume that most properties have units. A property such as height certainly does have units. So is get_height() returning the height in pixels, inches, meters, or something else? I have also been bitten by graphics libraries that measure distances in unexpected (to me) ways. Is the radius of the arc with line thickness 'n' using the inside radius, outside radius, center line, or something else? The person writing the original code may think the developer using their code down the road can test different assumptions, yet the reality is the number of combinations to test will rarely be trivial (and that is assuming they identify the correct parameters to test).
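One way a library can remove that guesswork is to put the unit and the radius convention into the types. The sketch below is only an illustration; `Millimeters`, `Inches`, `RadiusKind` and `centerline_radius` are hypothetical names, not taken from any particular graphics library.

```rust
/// A length in millimeters (hypothetical unit type).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Millimeters(pub f64);

/// A length in inches (hypothetical unit type).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Inches(pub f64);

impl From<Inches> for Millimeters {
    /// Converts using the exact definition 1 in = 25.4 mm.
    fn from(v: Inches) -> Self {
        Millimeters(v.0 * 25.4)
    }
}

/// Which edge of a stroked arc a radius refers to.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum RadiusKind {
    Inner,
    Centerline,
    Outer,
}

/// Converts a radius of the given kind to the centerline radius of an arc
/// drawn with a stroke of the given thickness.
pub fn centerline_radius(
    radius: Millimeters,
    kind: RadiusKind,
    thickness: Millimeters,
) -> Millimeters {
    match kind {
        RadiusKind::Inner => Millimeters(radius.0 + thickness.0 / 2.0),
        RadiusKind::Centerline => radius,
        RadiusKind::Outer => Millimeters(radius.0 - thickness.0 / 2.0),
    }
}
```

A caller can no longer pass a bare number and hope it is interpreted correctly; they have to say `Inches(2.0)` or `RadiusKind::Outer`, and the unit shows up in the generated documentation for free.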
It's at the point where I refuse to even consider using libraries that leave out documentation for obvious things. Even comments like "gets the height" raise red flags, since they demonstrate that the author did not put any thought into what they are documenting.
I don't agree with this answer. I fully agree that documentation is important, needs to be correct and maintained. However, I do stand by the original poster saying that it is often a bad idea to enforce javadoc-style comments to autogenerate documentation. This often leads to low quality documentation.
Like you say, get_height is a trivial function, but still requires attention. Enforcing in-code docs is not going to help to have higher quality docs, quite the contrary. You often get low effort stuff, just to make the code checker happy.
And if you put in a peer review process to validate the in-code comment, it loses all power because you might as well use that step for decent documentation. get_height should be documented in a logical place, where it makes sense indeed, like grouped with get_width. But now you have the logical place to put your documentation, and the forced javadoc comment. That's double work, and one of them will be bad quality as a result of it.
Nobody is arguing for no documentation, but I am arguing for avoiding javadoc enforcements. My solution is much simpler: have a documentation check as part of the peer review process. Sure, have a template for the documentation, but don't make it strict. Ours is simple: all juniors are on documentation peer review as part of their onboarding. If they don't get it, it needs to be fixed.
Our peer review process is quite simple: is the documentation adapted, is there a relevant unit test (we actually have low UT coverage, we only do them for critical code and as part of bug fixing, so enforced frameworks make no sense for us), and naturally is the code quality itself ok.
But many companies don't do those, yeah well that's how you get shit.
> get_height gets which height, outer or inner? Are there error values, e.g. 0 as "don't know any height"? Does it have side effects?
In my experience, if your documentation covers all these aspects it's guaranteed to be either wrong, misleading, or out of date on any of these details, and you better read the actual code to be sure.
In particular, the answer will often be "it depends on what the rest of the system does". Perhaps it delegates the actual calculation to an API, and this doc won't change when the API changes in a _mostly_ compatible way.
I mean, even with the best efforts given, code has bugs, words are vague, there's no way you should trust the dev who wrote the code to properly convey what it actually does.
Wouldn't it be more efficient to just read the source code? Also, the point the guy you're responding to is making is not that code shouldn't be documented, but that in-line documentation of this variety is not that great. You seem to be interpreting him to be saying that the code shouldn't be documented at all, which is not what he is saying.
It's true that there can be gotchas in code which should be documented, but I don't think forcing a template on people is the solution. In fact, I wouldn't expect people to be better at documenting them with that in place.
Well, yes, you need to have manual or automatic checks for the presence of the template, and the correctness of the contents is often uncheckable by automated tests. But if things break and e.g. the filled-in documentation template incorrectly said "thread-safe: yes", it will be very easy to 'git blame' the culprit. That way you can at least slowly weed out the sloppy documenters, but I admit that this is tedious.
And, yes, I don't like this either and would like a better solution. But so far I didn't get any viable suggestions.
Yeah this is why I love literate programming. Being able to read a program with "narration" is so much nicer than just reading documentation piecemeal. Maintaining literate programs though is quite difficult because you have to figure out where new code or changes fit in the overall narrative. I have a scraper I wrote in literate style and I only have to change it yearly. Each year I forget what I wrote and then I reread the program and make the necessary changes.
Literate programming is great until your code base becomes unwieldy. Then you need a README.md file which acts like a pointer to the correct entry points.
Then as the code grows, you need to document the architecture, add small gotchas, etc.
I am very code literate, but depending on the size of the codebase and how much smoke and mirrors are used, I might take longer to grok how things are interconnected by reading the code than it would take me if someone gave a high-level overview in a paragraph or two.
I understand the sentiment, because I hate this in other languages too. But somehow in Rust it works. Of course there is also bad documentation of the kind you describe, but that is a different problem.
Bridging the gap between a prosaic high level explanation (how do the parts work together?) and a fine grained explanation of each part (what does that part do?) can be a challenge. Rust solves this somewhat by allowing you to do module level documentation (that can essentially look like a blog post, only that the examples get checked when the code is tested) and it lets you link to different entities.
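A small sketch of what module-level documentation with links can look like. The `layout` module, `Window`, `Pixels` and the crate name `my_crate` are invented for the example; the `//!` comment form and the bracketed links are the rustdoc features being illustrated.

```rust
pub mod layout {
    //! Window layout primitives (hypothetical module, for illustration only).
    //!
    //! Start with a [`Window`], which stores its height as [`Pixels`].
    //! To change the height, call [`Window::with_height`]; it returns a new
    //! window and leaves the original untouched.
    //!
    //! ~~~
    //! use my_crate::layout::{Pixels, Window}; // `my_crate` is a placeholder
    //!
    //! let small = Window::new(Pixels(100)).with_height(Pixels(40));
    //! assert_eq!(small.height(), Pixels(40));
    //! ~~~

    /// A length in pixels.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub struct Pixels(pub u32);

    /// A window with a height.
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub struct Window {
        height: Pixels,
    }

    impl Window {
        /// Creates a window with the given height.
        pub fn new(height: Pixels) -> Self {
            Window { height }
        }

        /// Returns a copy of this window with its height replaced.
        pub fn with_height(&self, height: Pixels) -> Window {
            Window { height }
        }

        /// Returns the current height.
        pub fn height(&self) -> Pixels {
            self.height
        }
    }
}
```

The `//!` lines document the enclosing module rather than the next item, so the overview sits at the top of the module's generated page. Bracketed names such as `Window::with_height` become links to the item's own page, and the fenced example (written with `~~~` here only so it nests cleanly; backticks are more common) is run as a doc-test.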
I would always prefer a well written blog post, if people were able to keep the examples working and the code up to date. But experience shows they are not.
I'd rather have generated documentation that is true than a blog post where half of the examples won't work, because nobody bothered to update the post after the code changed. The first might at times be barren if done badly, the latter is downright misleading.
I am firmly of the opinion that every bit of public interface should get a documentation string.
You say "gets the height" is trivial, but that text tells not only what the function does, but also that the author couldn't think of anything else important to say. This is very different from no text at all, where you can't be certain if the author even thought about it.
IMO, enforcing an internal structure (AKA "you must document each parameter and the return value") is counter-productive, but enforcing the existence of the comment is very productive.
> Also it clutters the code with many highly trivial remarks.
I'd say that it "clutters" the code with markers for public elements and hard to understand ones. Those are actually valuable, and not clutter at all.
Anyway, if those markers are a large share of your lines, you may need to rethink your architecture. It's usually not valuable to have a lot of interface for trivial things.
You've put your finger on exactly why I don't like this kind of documentation.
Also, I don't tend to trust it as much, because it isn't actually what the program executes.
When I'm trying to read the source code I don't want to read about the source code -- I want to read the actual source code -- and if I keep coming across long multi-line idiotic comments then it breaks my flow and concentration.
I like the source code itself to be extraordinarily readable, with long and descriptive variable and method names, but I want it to be dense and packed into paragraphs of sense.
To the extent there are any comments at all they should be extremely short and completely clarifying -- they should not even partially overlap with information conveyed through function or variable names, for instance.
> Also, I don't tend to trust it as much, because it isn't actually what the program executes.
Maybe you don't know Rust, but it is a strongly typed language. If you go to a Rust project and run `cargo doc --open` you will see exactly what the type system lays out. Sure, if someone writes "changes the window's height and returns a new Window" on a change_window_height(height: Pixels) -> Window and it changes the window's width instead, that is still wrong. But (A) you can click on "source" and see the actual code and (B) you are guaranteed that the function takes Pixels and spits out a Window.
That being said I hate that kind of documentation in most other languages, not in Rust. In Rust it is basically an alternative view onto the same code with a (needed) emphasis on the relations of the entities the type system describes. If there is prose that can be a nice extra, but cargo doc is even useful without a single comment.
This is API reference documentation. What you're missing is conceptual documentation and use case examples.
Conceptual documentation, the big picture, is important to convey the mental model implemented by an API. In applications, you can generally infer it from using the app, but it's not always easy, and it's indirect.
Use case examples string together multiple APIs, multiple domain objects, to achieve a high level business objective. When working on a project, you can sometimes get away without this - the existing code can be example enough to copy. You can end up with cargo culting, people copying things without understanding why. But if you have an API for third-party use, which needs documenting, you need to have either a well-seeded set of open source users, or a great set of examples.
Rustdoc supports conceptual documentation quite well, including testing the example code. A specialized page is better, but the pareto principle applies.
I often call this narrative documentation. It's the antidote to Chesterton's Fence and explains the thinking of the programmer who created the thing. How did that programmer expect people to use their tool?
Most popular languages have a version of this, including Java. This is only one level of documentation, and one must know a bit about what they don't know to use it effectively. In other words, if you already know that you need to use the foo function, then the foo function's documentation is great. If you don't even know which function or class/module to use, it isn't a great starting point.
I built a testing library on top of pytest based upon the idea of doing this mapping at an application level instead of a method/function/class level.
If you write tests in a strongly typed, non-Turing-complete markup (in this case, StrictYAML), you can then use it with a template and test artefacts (e.g. app screenshots) to generate readable how-to/tutorial docs which are guaranteed to stay up to date.
This isn't a new idea, but I find that people are often skeptical because there's a history of people getting their fingers burned by Gherkin's language design or YAML's weak typing (both of which are completely valid).
>I've worked in multiple companies where the problem was too much documentation, and of course everyone was afraid to update or *gasps*
That must be nice. I've yet to work at a company where half my time wasn't spent trying to prod for some resource (be it internal code, a public 3rd party tool, or even the resource itself), sometimes playing a game of goose just to figure out who knows the author. I'd love too much documentation.
But I understand your point. The only thing worse than no documentation is wrong documentation, and outdated docs can become outright wrong half the time, if they aren't simply encouraging outdated but functional practices. Tech writers are highly undervalued for that purpose.
I should also mention that the ability to properly search for docs is almost more important than the doc itself. Some companies had wikis but good luck searching for the right keywords if you didn't know the exact title. A properly categorized top level page could have helped a lot (and is probably easier/cheaper than integrating Google-like searchability into an internal database).
> The only thing worse than no documentation is wrong documentation
Yeah, that was the exact problem. One was a startup and when I joined it was all mostly up-to-date so it was great! But by the time I left (after 2 major migrations) most of it was out of date and a nightmare to find anything updated, any script that could still be run, etc.
It's an analogy to make the point - we delete old code that's not relevant any more. Imagine reading documentation where 80 percent of it is no longer relevant but is kept around "just in case".
Note: I'm the creator of both https://documentation.page/ and https://documentation.agency/
* (only 2 "negative" paragraphs on a book-length article)