The program is the database is the interface (scattered-thoughts.net)
137 points by jgrodziski on Feb 13, 2023 | hide | past | favorite | 76 comments


Since their inception, computers have been based on the model of a machine traversed by data, processed like sausages through a mincer: data was fed directly to hardware whose code was assembled as physical wires.

However, the software part of the machine is made of the same material as the data, so the two are much closer than current programming languages seem to imply.

It is natural that they tend to converge, especially in Lisp-like languages, toward a type of program whose structure intermingles code and data, organically co-evolving in whatever directions problem solving takes them.

Instead of a machine with a processing pipeline, a closer metaphor might be the phloem of plants, the living cells that transport food, and which form part of the plant's structure and grow along with it.

(BTW I've coined a name for that kind of self-rendered cells, processed in place as collections of data sharing the same structure; I call them 'wits' - as in, the minimum unit of meaning; in parallel to bits as the minimum unit of information. And if you look closely, you'll begin to see that a lot of modern programming systems have them everywhere. So you may want to use this term 'a wit' to describe the concept explained in the article.)


> self-rendered cells, processed in place as collections of data sharing the same structure

Isn't this .. the spreadsheet? I was entertained watching this drift towards a set of cells in a grid some of which contain formulae and some of which contain data, used for accounting purposes, and wondering when the lightbulb was going to come on.

(Funnily enough I've written my own transaction-annotating system lately. It's in python and stores the data in CSV, having got it from the Nordigen bank interface. EEA programmers may find Nordigen extremely useful: https://nordigen.com/en/coverage/)


> Isn't this .. the spreadsheet?

As long as by "spreadsheet" you don't mean specifically a 2D grid of cells, but rather a reactive computational graph that just happens to be rendered as a 2D grid, so as not to waste screen space the way reactive notebooks do :).

OK, the waste thing is not that clear-cut. Spreadsheets give you 2D instead of 1D workflow, but then all spreadsheets I know aren't able to store an array of values in a cell, leading to lots of space waste from every array element taking its own cell.

I hope one day someone will figure out how to merge reactive notebooks (Observable-style, not Jupyter-style) and spreadsheets - both are the same thing underneath, it's just the presentation layer that's different, with complementary pros and cons.
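To make the "same thing underneath" claim concrete, here's a minimal sketch of a spreadsheet as a reactive graph; the class and API names are invented for illustration:

```python
# A cell is a node; a formula is an edge; setting a value recomputes
# dependents. No 2D grid anywhere -- the grid is just one rendering.
class Cell:
    def __init__(self, value=None):
        self.value = value
        self.formula = None
        self.inputs = ()
        self.dependents = []

    def set(self, value):
        self.value = value
        for dep in self.dependents:
            dep.recompute()

    def define(self, formula, *inputs):
        self.formula, self.inputs = formula, inputs
        for cell in inputs:
            cell.dependents.append(self)
        self.recompute()

    def recompute(self):
        self.value = self.formula(*(c.value for c in self.inputs))
        for dep in self.dependents:
            dep.recompute()

# The equivalent of "=A1+B1": changing an input ripples forward.
a1, b1, c1 = Cell(1), Cell(2), Cell()
c1.define(lambda a, b: a + b, a1, b1)
a1.set(10)
print(c1.value)  # 12
```

A notebook and a spreadsheet would both sit on top of exactly this structure; only the rendering of the nodes differs.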


https://en.wikipedia.org/wiki/Lotus_Improv

I'm still not sure why this model didn't win out, aside from entrenched use of other spreadsheets.


> I'm still not sure why this model didn't win out

https://news.ycombinator.com/item?id=30872413


Ah, yeah, that explains why Microsoft has so many tools that seem to have one end that's a claw hammer.


MS Excel has largely replicated the Improv model as an optional mode within formulas, allowing you to define named Ranges for groups of cells and use those names as variables in formulas instead of coordinates.

However, it lacks the flexibility of a true dataflow model. They incorporated the full model into the Power Pivot plugin, and later migrated it to a standalone app as Power BI. It's a reasonably complete paradigm, but much less well-known than the spreadsheet.

https://spreadsheeto.com/power-pivot-vs-bi/


I'd love to see a sort of hypercube model that also had the ability to perform better database-style summarizing. Pivot tables with more dimensions and power, perhaps.

I might be an idiot, and it might all be obvious in some way, but everything I try to do in Excel is about 80% easy, and the remaining 20% is hard because of a lack of power in Excel's formula language - from no-brainers like fixing strings to things we take for granted elsewhere in programming, like defining our own functions. Sure, if you extend Excel you get some of it, but it should be able to do those things without dropping to VBA or add-ins. Meanwhile, its default functionality leaves a lot to be desired, especially quirks like how it interprets numbers, dates and strings.

I feel like I can explain to someone how to copy and paste things in excel so that formulas get applied, but having to teach them PowerQuery is beyond my ken.


> I'd love to see a sort of hypercube model that also had the ability to perform better database-style summarizing. Pivot tables with more dimensions and power, perhaps.

I'd say the recent data analysis tools based on notebooks and data lakes satisfy those requirements. A library like Pandas provides the cube programming model of data dimensions, columnar storage (e.g. Parquet files on Hadoop) gives you efficient access, and a Jupyter notebook is the user-friendly GUI.

It's certainly more complex than a spreadsheet, but the code parts may be encapsulated to not scare your users.


> I hope one day someone will figure out how to merge reactive notebooks (Observable-style, not Jupyter-style) and spreadsheet

I'm working on exactly that - I have several conceptual designs for presentation and interaction. That's where my wit concept comes from: an abstraction to represent cells in the reactive graph, which can be seen either as an array or as a single cell with a label and a prototypic value. Once you liberate the concept from the 2D grid, dependencies between variables create lots of interesting navigation and presentation problems, and expanded possibilities.

Unfortunately I don't have much time for side projects and my programming skills are a bit rusty, so I haven't advanced much in terms of actual implementation. Moreover, my 'universal' requirement of having it available as a system-wide background service would take a lot of work :-(


Check out hex’s reactive notebooks

https://hex.tech/blog/hex-two-point-oh/


> Isn't this .. the spreadsheet?

You nailed it, it's a spreadsheet - but one that doesn't constrain all data to live in the same application window, and can access and transform it throughout the whole system - just like copy/paste can transfer data from one program to another, where it can be transformed with a different toolset.

Copy/paste is the most fundamental tool in the toolbox of the end user, since it's the only universal* way to transfer data between apps without accessing an API programmatically, so I think it's undeservedly neglected as a tool to build real life practical workflows.

Incidentally, spreadsheets are one of the most versatile and successful End-User Development tools, allowing users to build information systems tailored to their business purposes. A system combining those design capabilities with the flexibility of building system-wide workflows could transcend the current model of siloed apps.

* Intents on mobile devices are similar, but they are not universal - many times the app to which you want to transfer data is not recognized as a valid target for the shared data.


The Unix pipe seems like scriptable copy & paste.


It's not, because you can't easily spawn transient pipes between running processes. The OS probably allows this, but I've never seen it exposed in the UI layer.

Pipes are primarily used to build ad-hoc pipelines that run in batch mode. Copy & paste is a tool for moving specific bits of data on demand between two running programs, or within the same program. Its interaction model is "please take this and put it there", and the ability to select the this and the there is just as important as transferring the data itself.


> can't easily spawn transient pipes between running processes.

Yeah, while this is theoretically possible it's unmanageable in practice (the "washing line" protocol allows you to send a UNIX pipe file descriptor to another process over a socket).
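For the curious, the fd-passing trick itself is only a few lines. This sketch (Python 3.9+, names invented; both "processes" collapsed into one for brevity) transplants a pipe's read end across a Unix-domain socket:

```python
import os
import socket

# A parent creates a pipe and hands its read end to a peer over a
# Unix-domain socketpair -- the SCM_RIGHTS mechanism alluded to above.
parent, peer = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
r, w = os.pipe()

# Ship the pipe's read end across the socket alongside a message.
socket.send_fds(parent, [b"take this pipe"], [r])
msg, fds, _flags, _addr = socket.recv_fds(peer, 1024, 1)

# The peer now holds a brand-new fd referring to the same pipe.
os.write(w, b"hello through a transplanted pipe")
received = os.read(fds[0], 1024)
print(received.decode())  # hello through a transplanted pipe
```

The mechanism works fine; what's missing, as the comment says, is any UI-layer way to point at two running programs and say "pipe this one into that one".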

I suspect the nearest thing that made it into production was OLE.


I'd say the closest thing in modern systems would be the Stream metaphor (a.k.a. Observer design pattern) in functional reactive programming.

This is the construct used to program the spreadsheet's reactive behavior, after all. You could think of users creating spreadsheet formulas as declaring a pipeline at runtime between two datasets, the origin range and the target range.

What is missing is a way to create "spreadsheet functions" among different applications. Online computational notebooks a la Jupyter are almost there, but they still need to share the same runtime kernel to transfer data between different books.

A combination of notebooks and web hooks should do the trick, methinks, but it needs someone to assemble this brittle system into a robust product.
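A hedged sketch of the formula-as-runtime-pipeline idea: an origin range as an observable stream, the formula as a map, the target range as a subscriber. All names are invented:

```python
# Minimal push-based stream, the construct behind spreadsheet reactivity.
class Stream:
    def __init__(self):
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def push(self, value):
        for callback in self.subscribers:
            callback(value)

    def map(self, fn):
        out = Stream()
        self.subscribe(lambda v: out.push(fn(v)))
        return out

# The user typing "=ORIGIN * 2" is, underneath, declaring this pipeline
# at runtime between an origin range and a target range:
origin, target = Stream(), []
origin.map(lambda v: v * 2).subscribe(target.append)
origin.push(21)
print(target)  # [42]
```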


Maybe by manipulating /proc/12345/fd/* ? Can these be manipulated or are they read-only?


it's one of the next things in computing, IMO


Possibly related: "Seme, the smallest unit of meaning recognized in semantics". [1]

[1] https://en.wikipedia.org/wiki/Seme_(semantics)


Thanks, I'll take a look. A wit would be the smallest unit of meaning in computing, then. I.e. the smallest part that needs processing beyond that of a primitive.


I forgot to mention that I like your idea. And yeah, semantics and computing are related in a complex way, but certainly are not the same thing.

I was distracted, trying to think of where I'd seen something like that in science fiction, and now I think (but can't check right now) that Robert Heinlein's novelette "Gulf" had something about the smallest unit of thought; the "psychon" or some such.

There may also have been something in Keith Laumer's "The Great Time Machine Hoax", but I'm less sure about that.


I should point out that I'm not entirely serious about the concept of wit really being any measurable amount of meaning, I just wanted a catchy name that sounds cool :-)

The true minimum meaningful computation is probably the lambda function, anyway.


Certainly lambda is Turing equivalent, so sure. Seems like you could say the same for other things that are Turing equivalent, OTOH. Lukasiewicz logic [1] ; MOV [2] ; Semi-Thue string rewriting [3], and so on.

But since such things usually suffer from the Turing Tarpit [4], perhaps there's one that has some kind of minimax to be ideally terse and require ideally-few steps for its Turing equivalency, similar to the way that radix 3 (closest to e) has been claimed to have been proven to be the ideally efficient representation system. [5]

(I say "claimed" simply because I recently saw a claim that there's a loophole in the usual reasoning.)

[1] http://en.wikipedia.org/wiki/%C5%81ukasiewicz_logic#Real-val...

[2] https://en.wikipedia.org/wiki/One-instruction_set_computer

[3] https://en.wikipedia.org/wiki/Semi-Thue_system

[4] https://en.wikipedia.org/wiki/Turing_tarpit

[5] https://en.wikipedia.org/wiki/Radix_economy


If I had a penny for every time a systems book/teacher said "systems are like living cells, they interact with themselves and with the medium"...


Because it's a valid metaphor :-)

Computation started at the Jacquard loom, so engineers have a tradition of seeing it as a machine crunching patterns.

But computer science, just like math, is a language, so it's deeply grounded in semantics. And semantics grow and evolve from the actions of each individual speaker, without a central processing unit controlling the process, so "organic" is a good way to describe it.


...You've been listening to Alan Kay? STEPS: https://www.vpri.org/pdf/tr2012001_steps.pdf

VPRI implemented this dataflow GUI concept along with many other concepts in a minimal OS. Maru underpins the GUI stuff, and is highly flexible in itself.


> ...You've been listening to Alan Kay?

Actually yes :-) I agree that the original vision of computing pioneers about "augmenting the human intellect" has yet to become reality, and the industry has been entrenched in a local minimum derived from the very early engineering practices. But all that is finally changing really fast.


Whenever I stumble on an article like that*, I can't stop wondering why so many developers think that SQL is hard/clunky in comparison.

*(any article where the author seemingly rebuilds a database engine from scratch with maybe 0.1% to 1% of the feature count present in any RDBMS, including SQLite)


> why do so many developers think that SQL is hard/clunky in comparison?

> A database introduces a different data model - I have to translate between database data-types and my programming languages native data-types.

The key concept is explained in this sentence in the article. It doesn't matter how difficult or simple it is; the problem is that it's a different model, so it creates translation inefficiencies and makes it more difficult to reason about.

The object-relational impedance mismatch is a known problem; having data in one place and format, accessed through a single programming model will typically be easier than a separate service with a different language and model, no matter how efficient.
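A tiny illustration of that translation tax, using Python's sqlite3 with an invented schema: SQLite has no native date type, so a datetime goes in as text and must be re-parsed by hand on the way out.

```python
import datetime
import sqlite3

# Toy schema, invented for illustration.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE events (name TEXT, at TEXT)")
db.execute("INSERT INTO events VALUES (?, ?)",
           ("deploy", datetime.datetime(2023, 2, 13, 9, 30).isoformat()))

name, at = db.execute("SELECT name, at FROM events").fetchone()
print(type(at).__name__)                  # str, not datetime
at = datetime.datetime.fromisoformat(at)  # the translation layer, by hand
print(at.year)                            # 2023
```

Every boundary crossing needs a little shim like that last line; ORMs exist mostly to mass-produce those shims.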


(My)SQL in itself can be quite powerful, although it seems its advanced features are underused.

I wonder about going the other way, and turning SQL into a full scripting language.


> turning SQL into a full scripting language

That's what PL/SQL and T-SQL basically are. They work, but they have a somewhat weird programming model, halfway between programming and declarative sql but being neither, with table cursors and stateful SQL queries.

And of course, their scope is limited to interacting with the database itself, not the outer world.


Partial list for Postgres:

    • pl/pgsql (a pl/sql clone)
    • pl/perl 
    • pl/python 
    • pl/v8 (JS, LiveScript, and CoffeeScript)
    • pl/java
    • pl/r
    • pl/lua
    • pl/rust
    • pl/ruby
    • pl/sh
    • pl/scheme
    • pl/julia
Or at least that's where you can define arbitrary procedural/functional logic for result sets within a language oriented toward set theory and transformation.


> no matter how efficient

Well... these things matter a lot!


It does matter, but then it doesn't, if it makes your life as a programmer more difficult, which is the situation described here.

SQL lives on because it makes database storage extremely efficient, not because developers enjoy switching context to an entirely different programming language and runtime environment. That they have to use it doesn't mean they have to like using it.


HTML is one context. CSS becomes a context switch with a different mental model. JS becomes another. Backend in Python, Java, PHP, etc. becomes another. Access via GraphQL, grpc, REST, etc. is yet another.

SQL just adds one more to the list for that distinct problem domain. If you don't like SQL, that's not something universal to all devs nor due to the context switch. You either don't know SQL as well or just don't like it. That's fine, just not something to apply categorical terms that apply to "developers".

If you can orient your mind to related sets and subsets, all other languages (and especially ORMs) start looking like square pegs for round holes.

(I'm a developer who happens to like SQL's round peg quite a lot.)

Imagine craftspeople complaining they hate using the tool perfectly suited to a particular task. "I'd prefer to use this hammer against the handle of the screwdriver to smooth out this wood, because I hate the context switch to using the planer, and I'm just more comfortable with hammers and screwdrivers."


I really don't get why some people in software are so invested in the tooling they use as part of their identity.


I happen to like SQL but maybe I've been doing this for too long. Still:

> having data in one place and format, accessed through a single programming model will typically be easier than a separate service with a different language and model, no matter how efficient.

I would need examples of such a system. The closest I know of is... a RDBMS coupled with a decent ORM.


Relational databases are lovely! SQL is... clunky, but it's still a wonderful way of working with data, compared to hand-rolling imperative loops or functional transforms.

I've been thinking about this a lot in the past ~5 years. My current dream is not that we should be using SQL and RDBMS-es more in our systems, but the inverse: we should have relational database facilities built directly into our programming languages. Because when I look at the code I typically work with from a data processing angle, a lot of it is just hand-rolling small pieces of a relational database.

I'm talking small scale here. That random business object storing two maps and wrapping them in a domain-specific API? That's two tables with two columns each, possibly connected with something else by their primary keys. Or two tables you always JOIN in a specific way. Or really just one table. But instead of encoding this understanding directly, we write the lower-level implementation of it - two maps to store data (prematurely baking in an indexing strategy), and relations left implicit, smeared across the object's methods.
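Sketching that last point: the two maps, written as what they implicitly are - two two-column tables and a join. Table and column names are invented for illustration:

```python
import sqlite3

# The "business object storing two maps", made explicit as relations.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE price (sku TEXT PRIMARY KEY, cents INTEGER);
    CREATE TABLE stock (sku TEXT PRIMARY KEY, units INTEGER);
""")
db.executemany("INSERT INTO price VALUES (?, ?)", [("apple", 50), ("pear", 70)])
db.executemany("INSERT INTO stock VALUES (?, ?)", [("apple", 10), ("pear", 3)])

# The relation the object's methods would otherwise hand-roll across
# two dicts and a loop:
totals = {sku: cents * units for sku, cents, units in db.execute(
    "SELECT p.sku, p.cents, s.units FROM price p JOIN stock s USING (sku)")}
print(totals)  # {'apple': 500, 'pear': 210}
```

The dict-of-dicts version bakes in one access path; the relational version leaves the indexing strategy and the join as declarations.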


You might be interested in data-oriented design, nicely described in this talk by Andrew Kelley: https://vimeo.com/649009599

Treating data in code using the same basic layout that databases use is common in game programming. The usual example is structs of arrays as opposed to arrays of structs.

https://en.wikipedia.org/wiki/Data-oriented_design but I suggest watching the above talk first.
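The structs-of-arrays point in miniature, in plain Python (purely illustrative; in practice you'd use numpy or a systems language where the memory layout actually matters):

```python
# Array-of-structs: one record object per particle.
aos = [{"x": 1.0, "y": 2.0}, {"x": 3.0, "y": 4.0}]

# Struct-of-arrays: one array per field -- columnar, like a DB table.
soa = {"x": [1.0, 3.0], "y": [2.0, 4.0]}

# Summing one field walks every record in AoS...
total_x_aos = sum(p["x"] for p in aos)
# ...but is a single contiguous pass in SoA.
total_x_soa = sum(soa["x"])
assert total_x_aos == total_x_soa == 4.0
```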


Same author made a good article exactly about this: https://www.scattered-thoughts.net/writing/against-sql


Ever hear the advice, "Beware the old man in a young man's game"?

Every couple of years a new upstart claims some fundamental flaw and illustrates how they would fix it. The industry has a lot of upheaval with new leaders emerging from the pack constantly.

But after almost fifty years, SQL is still there. Folks have ORM'd it countless times. Tried to apply it to their popular (and unpopular) language. Implemented new database styles with novel access methods.

And SQL is still just as strong if not stronger than ever. So either everyone just goes along with the old man out of habit despite the prospect of tremendous personal gain and advancement…

…or there's something important (or many things) the young upstarts just aren't recognizing about that old man—to their peril.


The author makes several points that any developer experienced with SQL would have noticed or would agree with. The author here is far from being a "youngster" and has worked several years maintaining/improving RDBMSes.


Young upstarts refers to the proposed replacement technology, not the man.

SQL, the language, is the old man in a young man's game.


IMO there is a big difference between RDBs as a store of data, in which you're trapped by past modeling decisions and a query language that constrains you, and SQL as a query language, which is decent enough.

SQL feels much better when you populate your DB with data from elsewhere, and use it as a tool to explore data you structured for your own purpose than it is when you try to ask questions to a model unfit for your purpose.


I will second the commenter who said that relational databases are lovely but SQL is clunky.

I'll take it a step further and say that SQL is a design flaw imposed on us by large database vendors, one that we have been paying the price for for the past 50 years.

There is a principle, included in SOLID, called "interface segregation": the idea that having many smaller, special-purpose interfaces is preferable to having fewer, larger, more general-purpose interfaces.

To understand the principle completely, just think about the differences between a consumer toaster oven and the cockpit of a Boeing 747. One can be operated by a five-year-old (with supervision) while the other requires years of training and specialized domain knowledge to operate.

In software, you can't get any more "general purpose" than a programming language.

Consider the complexity that SQL has brought us over the past half a century: SQL injection vulnerabilities, object relational mappers, complex and computationally-expensive queries that are difficult to understand let alone optimize, business logic hidden inside of stored procedures and on and on we go. Basically all of the problems and challenges that are inherent in any kind of human-crafted code can be found inside of SQL and it is just an interface in front of a data storage / persistence engine.

That doesn't mean that SQL is the most challenging language to learn and master. It just means that we have taken something special purpose (data persistence) and stuck a very general purpose interface in front of it with all of the complexity that doing so invites.


You think SQL injection vulnerabilities are due to the SQL language itself?

That's like saying JS is flawed specifically because too many people used eval(…).
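The eval() analogy is apt: injection happens when data is spliced into program text. A minimal sqlite3 sketch of both styles, with an invented table:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (name TEXT)")
db.execute("INSERT INTO users VALUES ('alice')")

evil = "' OR '1'='1"

# String concatenation: the attacker's input becomes part of the program.
injected = db.execute(
    "SELECT count(*) FROM users WHERE name = '" + evil + "'").fetchone()[0]

# Parameterized query: the input stays data.
safe = db.execute(
    "SELECT count(*) FROM users WHERE name = ?", (evil,)).fetchone()[0]

print(injected, safe)  # 1 0
```

Both are "SQL", but only one treats the query as code to be assembled at runtime, which is the eval() failure mode exactly.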


Your comment misses my broader point, which is that you don't NEED a text-based programming language to interface with a persistence engine.

And so through that lens, yeah it's pretty obvious that if SQL didn't exist then neither would SQL injection vulnerabilities. That doesn't mean that an alternative interface wouldn't have its own potential attack vectors. It's just one example of the complexity invited and brought upon us by using a general purpose text-based programming language to interface with storage.


Are those languages newer than SQL? Did their designers know that SQL integration would eventually be needed? If so, then why didn't they incorporate the glue as part of their language design?

I wonder if you are directing your ire toward the wrong language(s).


To be fair, it seems to me that the main idea of the article is different and more interesting than that. SQL hate is just a collateral.


Next step: UI components a-la http://witheve.com/


I don't like UI components. I don't want to drag things around on screen and drop them onto each other to make something happen.

Can't we have TUI instead, with fixed position menus, sane shortcuts, search bar for actions and options and such?

Just the other day I was wondering how we've had touchscreen ticket vending machines for ~10 years now, and they still don't work when the weather is sunny or rainy. Moving the cursor with physical buttons was a much better design.


> fixed position menus, sane shortcuts, search bar for actions and options and such?

(Some) GUIs used to be like that, merely adding pointing devices as an alternative way to operate, and for inherently graphical tasks. It doesn’t need to be restricted to TUIs.


This looks really interesting, why did they stop working on it?


They posted on HN about why, some years ago: https://news.ycombinator.com/item?id=16227130

tl;dr it was hard to monetize - experimental research doesn't fit the startup model well - but at the core of it I think it lacked a compelling, concrete problem to solve. Eve was an amazing tool with a lot of potential, and an unclear story of how a person might actually want to use it in their life.


It's a really interesting tool. They should've focused this on the personal knowledge management crowd, I feel like they would eat that shit up for a research focused environment.


Experimental programming languages based on completely novel concepts are difficult to program, and even more difficult to disseminate and monetize.


Maybe because he got absorbed by TigerBeetle.


Another alternative is to use Excel.

The downsides being [compared to this one] :

- Excel is more bulky. Takes time to start.

- You are forced to use Excel's editor, and not any editor of your choice.


- Your programs are not portable (guess what VLOOKUP is in the Spanish version)


They are portable, you can open a file made in Spanish Excel in the English Excel and vice-versa and always see VLOOKUP in English Excel.

It's "just" that since the formula language is also localized, you'll need the Excel localization(s) you know to work on it.


I stand corrected!


This is like Jupyter notebooks for clojure with UI elements. Certainly helpful for small applications.

Jupyter seems to have community support for other languages though[1] including clojure, so it would be nice to see how that compares.

[1] https://github.com/jupyter/jupyter/wiki/Jupyter-kernels


Clojure also has Clerk, which is like Jupyter, but more befitting Clojure's overall philosophy: https://clerk.vision/


I've had a similar experience using the sam [1] text editor in command line mode, basically like a REPL. Filtering and modifying data using sam's "structural regular expressions". Rob Pike, the author of sam, specifically points out in the linked paper that the command language is great for manipulating multi-line "records".

Add the ability to { nest { expressions } } + very flexible file system IO (e.g. pipe selection to an external command and return the result), and you have a really nice recursive querying language. I specifically like it because of how easy it is to refine the query/regex substitution when it didn't yield the expected result.

1: http://doc.cat-v.org/plan_9/4th_edition/papers/sam/ (In particular, scroll to the "Structural Regular Expressions" section.)
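Not sam, but the flavor of structural regular expressions can be approximated in a few lines of Python: carve the stream into multi-line records first, then run commands against records rather than lines. The data is invented:

```python
import re

text = """\
name: alice
role: admin

name: bob
role: guest
"""

records = re.split(r"\n\n", text.strip())          # x/.../ -like step: split into records
matched = [re.sub(r"name: ", "admin user: ", rec)  # s/// within each record
           for rec in records
           if re.search(r"role: admin", rec)]      # g/.../ -like step: filter records
print(matched[0])
```

What sam adds over this sketch is that the steps compose arbitrarily and operate on the selection in place, which is what makes iterative refinement so pleasant.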


This is interesting.

I do wonder what the ergonomics of querying the data are for people who are unfamiliar with your APIs; they may prefer to query the data with SQL or a cursor-based API.

The Volcano model for querying in SQL databases is useful, but I've never implemented it. (It's kind of hard to search for on Google, though.)

https://www.computer.org/csdl/journal/tk/1994/01/k0120/13rRU...


So, a spreadsheet?


a better spreadsheet, at least for the person using it



you might be interested in https://youtu.be/HB5TrK7A4pI

"We Really Don't Know How to Compute!"- Gerald Sussman (2011) [1:04:18]


you could also just have a string tag field, avoiding some inference which often needs to be overwritten

I use your typical boring spreadsheet and it's mostly just numbers and strings with a few sums for totals


Is there a reason to homeroll your accounting like this vs using one of the plain text accounting tools and their scripting interfaces?


Coding to learn the domain. Correcting for conceptual model mismatch between you and the tool. Possibly both at the same time.

Take Ledger: I've approached it repeatedly over the years, but I always struggled with it. Not because of double-entry accounting: that part I get. It's everything else that's confusing to me, because the tool - like every other tool in this space - seems to be written for an American audience that eats Equities, breathes Liabilities, commutes between Accounts Payable and Accounts Receivable, and does 99% of its finances through the stock market.

For a simple person from middle of nowhere in Europe, this is a completely alien mental framework. I think in terms of having dollars, paying in them, getting paid in them, occasionally borrowing them to or from someone else. Every time I read through Ledger's docs, I stumble on the same set of problems, like "wtf is this whole accounts payable / accounts receivable business", or "if this is a small-scale accounting/budgeting tool, why half of the docs are about managing stock prices?!".

So yeah, I end up doing most of my personal finances with ad-hoc Emacs Lisp in an Org Mode document, or throwaway Excel sheets, and each time I do it I understand ever so slightly more about finances. For anything non-trivial, I just mail an accountant.


Fellow European here. I'm currently transitioning my accounting from Ledger (used it for 5+ years) to PTA, a tiny Perl program by OpenBSD dev Ingo Schwarze: https://mandoc.bsd.lv/pta/

As compared to Ledger, the journal entries are much less verbose, mostly fitting on a single line: https://cvsweb.bsd.lv/~checkout~/pta/journal.example.en?rev=...

As you can see, it also utilizes traditional account numbers (listed in a separate Chart of Accounts file) instead of Ledger's Account:Subaccount form. PTA also has Debit and Credit, so all amounts are positive, whereas Ledger uses "+" and "-".

For me, the PTA approach feels much safer, since basically everybody (including the auditors) in my country are probably used to this kind of system and a numbered Chart of Accounts. Less confusion.

Interestingly, while converting my .ledger file to .pta, I discovered a lot of mistakes that seem to be linked to either the system Ledger uses -- or, rather, me not understanding it. I was also fairly inexperienced with double entry accounting when I started with Ledger. But replacing +/- with Debit/Credit accounts pointed those errors out, somehow.

Here's the author discussing PTA and comparing it to Ledger&friends. Also a great discussion on plain text accounting in general: http://www.undeadly.org/cgi?action=article;sid=2020092812343...
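The Debit/Credit-with-positive-amounts invariant is small enough to sketch; account numbers and the journal format here are invented for illustration:

```python
from collections import defaultdict

# Each journal line names a debit account and a credit account with a
# single positive amount, PTA-style.
journal = [
    # (debit acct, credit acct, cents)
    (1910, 3000, 120_00),   # cash <- sales
    (6100, 1910,  45_00),   # rent <- cash
]

balances = defaultdict(int)
for debit, credit, cents in journal:
    balances[debit]  += cents   # debits increase the account...
    balances[credit] -= cents   # ...credits decrease it

# The invariant the bare +/- style lets you silently violate, and this
# style enforces by construction: everything sums to zero.
assert sum(balances.values()) == 0
print(balances[1910])  # cash balance in cents: 7500
```

Because each entry posts the same positive amount to both sides, a sign error in one place can't slip through unbalanced, which may be why converting surfaced those old mistakes.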


I absolutely hear you :)

I think "personal accounting" (wrong word) is like the note-taking-app craze! It's yet to be solved for the common man.

I'm talking about a 'first practical, THEN correct' system, versus the 'first correct, then practical' philosophy of the current solutions (Money, Sage, Pastel, etc.).

Every few years I go "THIS year I'm going to do proper financial records with one of the accounting programs, like a real adult!"

Then I'm soon bored to tears, or frustrated by all the things I don't know about accounting (my fault, not the subject's) or the apps.

I then quickly revert to a "money-in-bank / days-left-in-month" 'accounting-system' and a fin.txt in Dropbox where I write down some things...


The main reason is that it is interesting.

I could use LibreOffice Calc for my accounting as well.

But for playing with code it's the perfect combo: a problem I can tackle, and one I'm interested enough in to solve. Accounting for my own things is also quite safe - there's not much harm that can be done :)


One reason NOT to do it would be any legal repercussions if this is used for anything more than some kind of home budgeting. When you are reconciling the accounts, you shouldn’t miss items as described in the article.



