Hacker News
The derivative of a number (2014) (rjlipton.com)
133 points by tempodox on May 11, 2024 | hide | past | favorite | 89 comments


Mathematicians like to call two things with the same "structure" by the same name, even if it's not obvious how they're otherwise related. In this case, the shared structure is that both the derivative of a function and D(n) follow the product rule of derivatives. Sharing a structure means two mathematical objects are "morphic" (homomorphic[1], isomorphic[2], etc.) in some way, so in some sense they are the same object.

In this case, we can construct a morphism. Since D(n) follows the product rule, we only need to find, for each prime p, a function f of x which at some x has the value p and derivative 1. Then we can compose those functions by multiplication to cover all other natural numbers. f_p(x) = x + p is one such family of functions, giving us the complete function F_n(x) = Π_(p∈P(n)) (x + p), where P(n) is the multiset of prime factors of n (taken with multiplicity). Note that the empty product is defined to be 1, so F_1(x) = 1.

Finally, the homomorphism between D and the derivatives of functions is that D(n) = F'_n(0), so in some sense, D really is a derivative.
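A sketch of this construction in code (helper names like prime_factors are my own; D(n) is computed as the sum of n // p over the prime factors with multiplicity, which is equivalent to applying the product rule):

```python
from math import prod

def prime_factors(n):
    """Prime factors of n with multiplicity, e.g. 12 -> [2, 2, 3]."""
    out, d = [], 2
    while d * d <= n:
        while n % d == 0:
            out.append(d)
            n //= d
        d += 1
    if n > 1:
        out.append(n)
    return out

def D(n):
    """Arithmetic derivative: D(p) = 1 for primes, product rule otherwise."""
    return sum(n // p for p in prime_factors(n))

def F_prime_at_0(n):
    """F_n'(0), where F_n(x) is the product of (x + p) over prime factors of n."""
    ps = prime_factors(n)
    # Differentiating the product and evaluating at 0 leaves, for each
    # factor, the product of all the other primes.
    return sum(prod(ps[:i] + ps[i + 1:]) for i in range(len(ps)))

# The claimed homomorphism: D(n) = F_n'(0) for every n.
assert all(D(n) == F_prime_at_0(n) for n in range(1, 1000))
print(D(12))  # 16
```

Note that the construction only works if the factors are taken with multiplicity; with the set {2, 3} alone, F_12'(0) would be 5 rather than 16.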

[1]: https://en.wikipedia.org/wiki/Homomorphism

[2]: https://en.wikipedia.org/wiki/Isomorphism


I'm rather new to the concept of objects and morphisms myself, but I love how fields like category theory allow one to "zoom out" far enough in abstraction that two seemingly different concepts are actually the same applied to distinct contexts.


> so in some sense they are the are same object.

My understanding is that this is only the case if they’re isomorphic—even a pair of homomorphisms between the objects is not itself sufficient to identify them as the same. But I also don’t know any category theory, so I might be spouting nonsense :-)


Indeed, a homomorphism from A to B is an isomorphism from A to a subset of B. In other words, homomorphism is to injective as isomorphism is to bijective. So whether you say A is equivalent to B depends on how you define equivalent.

For anyone curious, while Category Theory is very concerned with morphisms, they also come up in many other places. In particular, Abstract Algebra (groups and rings and such) may be a more approachable introduction to morphisms than Category Theory, and then Category Theory flows pretty naturally from the concepts of Abstract Algebra.


A generic homomorphism does not induce any isomorphism to a subset of B that I can think of unless it is already injective. The isomorphism you’re after is rather from the quotient structure of A mod the kernel to the image.


Simply consider any constant morphism and it is obvious that the OP's statement is false. As you say, one needs injectivity.


I have found the Wikipedia entry to be much clearer and more interesting than this article:

https://en.m.wikipedia.org/wiki/Arithmetic_derivative



Yeah, the link is good, and I think the article is good too, discussing some limitations / gotchas / cons. Gives me the impression maybe a more useful definition could be crafted, but also that this D function could be unexplored and yield some cool stuff :) haha!


The article also contradicts Wikipedia, e.g., the article claims that Barbeau introduced the derivative in 1961, but Wikipedia gives several references to Shelly's definition of the derivative in 1911. You'd expect an emeritus professor to have enough time on their hands to check their sources...


One of the nice things about Wikipedia is that you can use it as a time machine, instead of just blindly throwing about accusations.

If you look at the first page in 2014, the time of the article, you'll find this: https://en.wikipedia.org/w/index.php?title=Arithmetic_deriva...

> E. J. Barbeau was most likely the first person to formalize this definition.

This suggests to me that Shelly's earlier definition in 1911 was not generally known. Indeed, Wikipedia has been a significant driver in rescuing earlier results in mathematics from obscurity.

But you can't cite what you don't know. So I wouldn't expect an emeritus professor, or anyone, to have either a time machine or a crystal ball.


Look, I'm a professor of mathematics at a research university. If I decided to use Wikipedia as my sole source for deciding who to credit for what, my papers would get rejected when they inevitably become wrong. "it wasn't on Wikipedia" is hardly an excuse when it comes to ethics. What is known by the average Wikipedia editor is not what is known by experts in the field.


“Ethics” for what is almost certainly an honest mistake with a reasonable diligence of research? Accusing someone of ethics violations should require more research than you’ve done!


Would you have found out about those earlier sources without Wikipedia? I think it really depends on how obscure they actually were. I’d take the fact that neither Wikipedia nor this article knew about these earlier sources in 2014 as evidence that they were pretty obscure.


You don't need to reply twice to make your point.

> Would you have found out about those earlier sources without Wikipedia?

I'm not an expert in this domain. The author is.


That should mean you know better than to accuse one of your colleagues of ethics violations on the basis of a cursory perusal of Wikipedia.

Would you do something like that using your real name?


In an informal conversation, the equivalent of a forum like HN but with my real name? Sure, I would have no problem saying out loud and unanonymously

> "it wasn't on Wikipedia" is hardly an excuse when it comes to ethics

which is what I actually wrote, in a post where I'm not even talking about the particular situation. I expressed dismay about a failure to properly attribute ideas, and then I explained that I don't care about what was written in 2014 on Wikipedia, experts should know better anyway. If you've decided to read an accusation in what I wrote, that is absolutely on you. The most negative thing I wrote about the author was:

> You'd expect an emeritus professor to have enough time on their hands to check their sources...

And, again, I would have absolutely no problem saying that in public with my real name. I've said much "worse".


They likely weren't emeritus in 1961.


I'm talking about the author of the article, Lipton. Not Barbeau who is the subject of the article itself, not its author.


The article is 2014, when Lipton was 64. So maybe emeritus, maybe not.


Comments on the OEIS sequence D(n) are interesting: https://oeis.org/A003415. One in particular points out that if we take the prime factorization of n = p0*p1*...*pk, where the pi aren't necessarily distinct, then over the reals

           lim  (p0+h)(p1+h)...(pk+h) - p0*p1*...*pk
    D(n) = h->0 ------------------------------------ .
                               h
This motivates the D(p)=1 definition, meaning we can take D to be defined just by the Leibniz rule. (Note that D(1) = D(1*1) = 1*D(1) + D(1)*1 = 2*D(1), implying that D(1)=0, so that part of the definition is redundant.)

Others have pointed out that the definition also works for any unique factorization domain, and in the case of polynomials, the Leibniz rule guarantees D agrees with the standard derivative, which is also a nice sanity check.
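A quick numeric check of the Leibniz-rule definition (the D below is my own illustrative implementation, computing D(n) as the sum of n // p over prime factors of n with multiplicity):

```python
def D(n):
    """Arithmetic derivative: D(n) = sum of n // p over prime factors
    of n counted with multiplicity (so D(p) = 1 for primes, D(1) = 0)."""
    total, d, m = 0, 2, n
    while d * d <= m:
        while m % d == 0:
            total += n // d
            m //= d
        d += 1
    if m > 1:
        total += n // m
    return total

# The Leibniz rule D(ab) = a*D(b) + D(a)*b holds for all products:
assert all(D(a * b) == a * D(b) + D(a) * b
           for a in range(1, 60) for b in range(1, 60))
print(D(2), D(3), D(6))  # 1 1 5
```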


As an aside, there also exists the notion of the derivative of a regular expression, which has useful applications:

https://jvns.ca/blog/2016/04/24/how-regular-expressions-go-f....


This idea was used to great effect in Matt Might’s “Parsing with Derivatives” paper [0]! And it featured prominently in the Compilers class he taught at the University of Utah.

[0] https://matt.might.net/papers/might2011derivatives.pdf


Excellent paper. I recall it goes past simple/regular languages and can be used to parse anything without first building a lexer.


Looks very interesting. Does it mean that RegExp Derivatives can be used to parse languages which basic regular expressions cannot?

As an example basic HTML cannot (?) be parsed by RegExp because tag-pairs can contain tag-pairs:

   <div> <div> </div> </div>   
eludes RegExp matching, it seems to me, because a typical standard RegExp would only match "<div> <div> </div>" and would not see the 2nd </div>.

Can RegExp Derivatives do it better?


HTML can be described by a context-free grammar [0], but not by a regular grammar [1]. If a language can be described by a regular grammar, you can parse it with a regular expression -- that's where the "regular" in RegExp comes from!

Derivatives of RegExps don't automatically unlock parsing of context-free grammars, afaik. For that you need recursion. They do however unlock some very elegant parser designs.

[0] https://en.wikipedia.org/wiki/Context-free_grammar

[1] https://en.wikipedia.org/wiki/Regular_grammar
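For anyone curious what a derivative-based matcher looks like, here is a minimal sketch of Brzozowski derivatives (the tagged-tuple representation is my own illustrative choice, not from the linked paper):

```python
# Regular expressions as tagged tuples:
# ('empty',) matches nothing; ('eps',) matches the empty string;
# ('chr', c) matches the single character c; plus 'cat', 'alt', 'star'.
EMPTY = ('empty',)
EPS = ('eps',)

def nullable(r):
    """Does r match the empty string?"""
    tag = r[0]
    if tag == 'eps' or tag == 'star':
        return True
    if tag == 'empty' or tag == 'chr':
        return False
    if tag == 'cat':
        return nullable(r[1]) and nullable(r[2])
    if tag == 'alt':
        return nullable(r[1]) or nullable(r[2])

def deriv(r, c):
    """Brzozowski derivative of r with respect to character c."""
    tag = r[0]
    if tag in ('empty', 'eps'):
        return EMPTY
    if tag == 'chr':
        return EPS if r[1] == c else EMPTY
    if tag == 'cat':
        left = ('cat', deriv(r[1], c), r[2])
        # If the first part can match "", c may also start the second part.
        return ('alt', left, deriv(r[2], c)) if nullable(r[1]) else left
    if tag == 'alt':
        return ('alt', deriv(r[1], c), deriv(r[2], c))
    if tag == 'star':
        return ('cat', deriv(r[1], c), r)

def matches(r, s):
    """Match by repeatedly differentiating, then checking nullability."""
    for c in s:
        r = deriv(r, c)
    return nullable(r)

r = ('star', ('cat', ('chr', 'a'), ('chr', 'b')))  # (ab)*
print(matches(r, 'abab'))  # True
print(matches(r, 'aba'))   # False
```

This only handles regular languages; as noted above, the "parsing with derivatives" work extends the idea to context-free grammars by allowing recursive grammar nodes.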


Read up on context-sensitive grammars.


As an aside on an aside, there is also the related classic:

Conor McBride, The Derivative of a Regular Type is its Type of One-Hole Contexts

http://strictlypositive.org/diff.pdf

related to Huet's Zipper and Hinze & Paterson's Finger Trees, with a huge amount of follow-on literature:

https://personal.cis.strath.ac.uk/conor.mcbride/Holes.pdf

https://conal.net/blog/posts/differentiation-of-higher-order...

and numerous old posts by sigfpe (Dan Piponi), e.g.

http://blog.sigfpe.com/2008/06/blessed-mans-formula-for-hole...


This is one of the many examples of geometric concepts being applied to integers (in this case, the notion of derivation, although a non-linear one).

Another important concept is that of a curve and its ring of functions in algebraic geometry; for the integers, the curve is the prime spectrum of Z, i.e. the prime ideals (p) generated by each prime number p. The ring of regular functions is precisely the ring of integers, operating as functions on prime numbers by n(p) = n modulo p.

I wonder if D has any interpretation in terms of nonlinear differential operators on Spec(Z).


Ahh, nice that one dares to proffer this. I can plug my own wondering about whether one can formulate an arithmetic derivative that instantiates the Kähler differential concept (but I'm supposed not to ask unless I already knew the answer, so what'd be the point).



Are you saying the derivative is a geometric concept? Tangent slope of a curve is simply one application of a derivative; it's not the derivative's identity. What the derivative is is the inverse of an inner product.


Can you people stop with the inane pedantry? Yes, the derivative is a geometric concept and so is the inner product; they are at the core of what a Riemannian manifold is, they group to form the (co)tangent spaces of varieties and schemes and their derived structures produce the local geometric data of the object in question.


The point is that the derivative is a more general concept than just geometric, and is naturally defined with or without a geometric context. Of course almost anything can be modeled geometrically. You can draw a picture of almost anything. The integers themselves are obviously geometric, by drawing a kindergarten number line.

https://en.m.wikipedia.org/wiki/Langlands_program


This is all semantic nonsense that devalues the original post I made about a potential connection between two arithmetic-geometric objects. But thanks for patronizing me, a research mathematician actually working on the arithmetic Langlands program, with a wikipedia link to the fucking Langlands program.


Huh? Inner product only needs vectors. No geometry required.


That's one of many definitions of derivative.


This function D on the integers might be interesting, but why call it a derivative? I can’t see much conceptual similarity with calculus derivatives.


It is pretty common to talk about derivatives in abstract algebra in contexts where the calculus definition makes no sense.

Broadly, there is a class of functions referred to as "derivations" that can be viewed as a generalization of the derivative. In particular, a derivation satisfies 2 properties:

1) It is linear

2) It satisfies the product rule.

Any function that satisfies these rules is often called a "derivative".

Notably, the function discussed in the article fails the linearity test, which is a pretty big problem for calling it a derivative.
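The failure of additivity is easy to see concretely; a tiny sketch (D here is my own illustrative implementation of the arithmetic derivative):

```python
def D(n):
    """Arithmetic derivative: sum of n // p over prime factors of n
    with multiplicity."""
    total, d, m = 0, 2, n
    while d * d <= m:
        while m % d == 0:
            total += n // d
            m //= d
        d += 1
    if m > 1:
        total += n // m
    return total

# Additivity fails: D(2 + 3) != D(2) + D(3).
print(D(5))          # 1, since 5 is prime
print(D(2) + D(3))   # 2
```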


Also, the abstract algebra definition is an actual generalization, because the (analysis) derivative acts as a linear operator in the vector space of (suitably smooth) functions. It blew my mind when I first encountered this way of looking at derivatives. And from there, it makes complete sense to look at operators with similar properties in other vector spaces.

The definition presented here is a loose analogy to derivatives rather than an actual generalization, which doesn't fully justify using the name IMO.


> And from there, it makes complete sense to look at operators with similar properties in other vector spaces.

It does, but this one's on a rig (a ring without negatives), not a vector space over any field.


I don't see how you would define linearity in a ring (I don't mean a module, just a ring). I.e. D(af) = aD(f) doesn't make sense if you don't have scalar multiplication.


The minimum would be to ask for D(a+b) = D(a) + D(b).

EDIT: Actually, according to wikipedia, that is exactly what is done in differential algebra

https://en.wikipedia.org/wiki/Differential_algebra


> I don't see how you would define linearity in a ring (I don't mean a module, just a ring). I.e. D(af) = aD(f) doesn't make sense if you don't have scalar multiplication.

Every ring is a module over itself. But you wouldn't want the definition you propose; instead, you'd want `D(ab) = aD(b) + D(a)b`. If you really like some sort of linearity to be present, you could observe that this property forces every derivation to be linear as a transformation of `R_0`-modules, where `R_0` is the subring `ker(D)` of "constants".


> It satisfies the product rule.

It's been many years since I had calculus at uni, and never any abstract math. Why is the product rule picked as the "interesting" attribute of derivatives, ie to serve as the basis for the generalization?

Is there some deeper connection of the product rule in ordinary derivatives that singles out the product rule over the other properties a derivative has?

For me, a key aspect of derivatives is that they allow for something like Taylor expansions or integrals to exist. Are there any equivalent things for these product-rule-generalized derivatives?


I actually think it is better to not think of derivations as a generalization of derivatives; but to think of derivatives as just one of many examples of a derivation. That the name 'derivative' comes from calculus and became the name for this abstraction is just a quirk of history. You could also consider the notion of a linear function to be a generalization of derivatives. Or even the notion of homomorphisms in general.

Having said that, there are many usages of other derivations where the calculus inspiration is clear, even if the geometric meaning that motivated the calculus is lost.

For instance, we often talk about polynomials over arbitrary fields. In general, there is no way to graph such polynomials. There is no notion of tangent lines, slope, "continuous", or even "less than". There is, however, still the notion of roots and the multiplicity of roots. These notions turn out to be quite important.

When working with any polynomial, you can define the "formal derivative" as a derivation that also satisfies D(x) = 1, D(a) = 0 (where a is an element of the underlying field). This operator behaves as you would naively expect a derivative to behave over polynomials. In Galois theory, it is important to distinguish between polynomials that have repeated roots, and those that do not. If you have a polynomial f(x), you can determine this by taking its formal derivative f'(x). Then, you can easily compute their greatest common divisor [0]. If this is a constant, then you know f has no repeated roots.

[0] Using Euclid's algorithm, this is a purely mechanical process that can be done without needing to factor either f or f'. A similar trick has actually been used to attack real-world cryptography. If there are secret primes p and q, and a public number pq, many cryptosystems assume that it is infeasible to determine what p and q are. However, if there is a bad random number generator, you might get 2 different keys that share a prime, so have the public numbers pq and ps; then you can easily determine that p is a common factor, from which you can easily recover q and s. This means that you can gather a large collection of public keys and try this attack on each possible pair of them.
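The shared-prime attack in [0] is a one-liner with gcd; a toy illustration (the tiny primes are obviously nothing like realistic key sizes):

```python
from math import gcd

p, q, s = 101, 103, 107   # toy "secret" primes
n1, n2 = p * q, p * s     # two public moduli that accidentally share p

# A single gcd recovers the shared prime, and division recovers the rest.
shared = gcd(n1, n2)
print(shared)             # 101
print(n1 // shared)       # 103
print(n2 // shared)       # 107
```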


Thanks, that makes more sense. Most of the time these generalizations make sense to me, sometimes it can be really obscure. Flipping the viewpoint makes it more clear in this case.


The product rule is really the simplest rule for derivatives that applies to operations on numbers and gives something interesting.

Other basic rules would be addition:

(f(x) + g(x))' = f'(x) + g'(x)

This is just linearity (together with constant multiplication)

And function composition (the chain rule):

(f ∘ g)' = (f' ∘ g)⋅g'

We would need to somehow figure out what should correspond to function composition.

So if we want something that captures some important algebraic properties of derivatives the product rule would be a good place to look.


> 1) It is linear

This ”derivative” is not linear though, and that was sort of what motivated my question.


> The arithmetic derivative can also be extended to any unique factorization domain (UFD),[6] such as the Gaussian integers and the Eisenstein integers, and its associated field of fractions. If the UFD is a polynomial ring, then the arithmetic derivative is the same as the derivation over said polynomial ring

I was just skimming the wikipedia article but this seems like a good argument.


Because it obeys a similar rule to derivatives on functions. The pattern of taking a concept and extending it to other things by looking only at what happens under one rule is extremely common in mathematics. That way you can e.g. extend the definition of a derivative to non-continuous functions.


As Wikipedia puts it, the

> number derivative is a function defined for integers, based on prime factorization, by analogy with the product rule for the derivative of a function that is used in mathematical analysis.


I'm now curious if this can be defined on Gaussian integers...


It's already defined, as given, for all unique (prime) factorization domains, including Gaussian integers.


It seems pretty confusing given that operator D already exists as notation for the usual differential calculus. (Not the most common notation, but useful sometimes). It may not be very exciting that D{n} = 0 for all numbers, but that's the actual derivative of a number.


> It may not be very exciting that D{n} = 0 for all numbers, but that's the actual derivative of a number.

No, that’s the derivative of the function that returns n whatever its argument, which can also be written as λx.n or in a zillion different ways.

That can be written as n, but is different from the number n.

Also, one man’s “pretty confusing” is another man’s “similar things should have similar names”. There’s a rich history in mathematics of overloading the meaning of terms and symbols as long as there’s some similarity between them, for example when using × for both the multiplication of numbers and of matrices (where the former is commutative, but the latter isn’t, barring some exceptions such as 1 × 1 matrices).

(See also the comment elsewhere in this thread which says “Mathematicians like to call two things with the same "structure" by the same name, even if it's not obvious how they're otherwise related”: https://news.ycombinator.com/item?id=40327885.)


I'm just reading about it now, but I think because the derivative power rule holds for it, and you can do partial derivatives based on primes. The linked wikipedia article covers it well in Elementary properties.


D(n) is equal to f'(0) where f(x) = Π_(p∈P(n)) (x + p) and P(n) is the multiset of prime factors of n (taken with multiplicity).


Derivative of a binary number:

Think of any binary number as a sequence of 0s and 1s in a certain order

For example, 8 in binary is the sequence: 1000

Reading the sequence from right to left, two bits at a time, for each pair of adjacent bits we can note whether the value changed from the “earlier” bit to the later one

I can note a change as 1 and a not change as 0, then the above sequence becomes:

0 (0-0 no change) 0 (0-0 no change) 1 (0-1 changed)

Result: 100

In decimal: 4

Now if I want to “integrate” that sequence, I can do the reverse, but now I have ambiguity, if I start with 0, the sequence would be the original:

0 0 (0 means no change) 0 (0 means no change) 1 (1 change)

Result: 1000

But if we start with 1 instead:

1 1 (0 no change) 1 (0 no change) 0 (1 change)

Result: 0111

Intuitively you can think of this as tracking a “discrete rate of change”

Usually the derivative or slope of a function gives a real value; now imagine “zooming into” the function until you can’t track a real value anymore, only whether what you are looking at is changing or not every time you look
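A sketch of this bitwise derivative and its two possible "integrals" (function names are my own):

```python
def binary_derivative(n):
    """XOR each pair of adjacent bits of n; the result has one fewer bit."""
    k = n.bit_length()
    if k <= 1:
        return 0
    # n ^ (n >> 1) computes all adjacent XORs; mask off the top bit,
    # which has no "earlier" neighbour to compare against.
    return (n ^ (n >> 1)) & ((1 << (k - 1)) - 1)

def binary_integrals(d, nbits):
    """Both 'antiderivatives' of d with nbits bits: one starting the
    reconstruction from 0, one from 1 (the ambiguity described above)."""
    results = []
    for start in (0, 1):
        bits = [start]                      # bits[0] is the lowest-order bit
        for i in range(nbits - 1):
            change = (d >> i) & 1           # 1 = value flips, 0 = stays
            bits.append(bits[-1] ^ change)
        results.append(sum(b << i for i, b in enumerate(bits)))
    return results

print(bin(binary_derivative(0b1000)))             # 0b100  (8 -> 4)
print([bin(x) for x in binary_integrals(0b100, 4)])  # ['0b1000', '0b111']
```

The two integrals are bitwise complements of each other, which matches the ambiguity in the comment: the derivative only records changes, not the starting value.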


This can also be generalized to anything that can be expressed as sequence generated by a function

You can always just split the thing into its sequential elements, then get a pair-wise derivative in between the elements

The sequence of pair-wise derivatives is then the equivalent of the derivative of the original sequence

If you do this to the limit where the “space” between the elements is 0, then you get the continuous case


Entropy of a bit string may be another way to view the bitwise derivative. Paper here for analyzing primes:

    https://arxiv.org/abs/1305.0954

(BiEntropy: The Approximate Entropy of a Finite Binary String)


Thank you for the reference, these concepts are definitely closely related

This is super interesting:

> We successfully test the algorithm in the fields of Prime Number Theory (where we prove explicitly that the sequence of prime numbers is not periodic)

What we do, and what ML algorithms try to imitate, when learning, is exactly that: finding loops (periodic sequences) within the data (or rather, fitting the data to continuous “loopy” representations)


Discussed at the time:

The Derivative of a Number - https://news.ycombinator.com/item?id=8198607 - Aug 2014 (47 comments)


Interesting... I'd imagine it's not common to have re-submissions of exact same links ten years apart, without other re-submissions in-between.


It's not that uncommon; in fact, there's another one on the front page right now with a bigger gap:

https://news.ycombinator.com/item?id=40329173

https://news.ycombinator.com/item?id=2920379


OK, I apologize in advance for being "that guy" but I feel compelled to ask: are there any practical applications of this, in the sense of applications of the normal calculus "derivative" (eg, slope / "rate of change"). That is, would I ever calculate this "derivative" to solve any kind of real world engineering problem or anything? Or does this serve solely in the realm of pure (theoretical) mathematics?


This notion of derivative of a number depends very weakly on the number itself. It only depends on the multiset of exponents in the number’s factorization into a product of prime powers. In this way it’s like the divisor counting function, the omega and Omega functions, the möbius function, etc. As such, its value distribution might be interesting. Something analytic number theorists might like to play around with.


In my naive mind, if you can differentiate a number then you should be able to integrate it. So what then is the integral of a number?

if I(n) is the integral of n, then shouldn't I(D(n)) == n?

But if D(prime) = 1, then what prime is the answer to I(1) ??

If this can't be done, then what we're doing here isn't differentiation as I understand it. So why call this the derivative of a number? Why not call it something else?


It is, indeed, not differentiation in the sense you're used to. Here it just means a map that satisfies the Leibniz rule, (fg)' = f' g + f g' or D(fg) = D(f) g + f D(g). Maps with this property are usually called "derivations".

Although you might also want to consider that "integration" (really, indefinite integrals, AKA antiderivative) is only defined up to a constant. So why couldn't it be the same for this "number derivative"? Perhaps the "antiderivative" is only defined up to something. It'd be a fun exercise, if you're interested. Can you figure out under what conditions do you get D(a) = D(b)? Put differently, given an integer c, what are the solutions to D(x) = c?
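A brute-force peek at that exercise (D here is an illustrative implementation; grouping n by D(n) shows the preimages):

```python
from collections import defaultdict

def D(n):
    """Arithmetic derivative: sum of n // p over prime factors of n
    with multiplicity."""
    total, d, m = 0, 2, n
    while d * d <= m:
        while m % d == 0:
            total += n // d
            m //= d
        d += 1
    if m > 1:
        total += n // m
    return total

preimages = defaultdict(list)
for n in range(1, 100):
    preimages[D(n)].append(n)

print(preimages[0])        # [1]: only D(1) = 0 in this range
print(preimages[1][:5])    # [2, 3, 5, 7, 11]: the primes
print(preimages[5])        # [6]: D(6) = 5, an "antiderivative" of 5
```

Note that D is not injective (many n can share a value of D), so any "antiderivative" is a set of solutions rather than a single number.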


Solutions to D(x)=C are probably something like p_a^a * p_b^b * … * p_n^n with a+b+d+…+n (yes, I am skipping c) equal to your constant C, and p_a, p_b etc. all prime.


No, that looks wrong. Take c = 2, for example. You seem to be saying that p^2 ought to be a solution. But D(p^2) = 2p (because D(p^2) = D(p) p + p D(p) = p + p = 2p).


Starting from my discussion in [1], you can find a sort of antiderivative of natural numbers by finding which natural number (if any) ∫F_n maps to. Note however that D is not injective and has no inverse.

[1]: https://news.ycombinator.com/item?id=40327885


Heh, the "integral" I(1) = 2 + C, where C is a constant from the set {0, 1, 3, 5, 9,...} (has to add up to another prime number).


What does D look like around the origin? I was hoping for a plot or something, also pretty hard to search for.


People may disagree about the use of "derivative" as used in the article. May the bludgeoning-by-definition continue! Alas, if one day we tire of definitional-skull-splitting...

... for some given function, we can simply recognize the difference between: (a) the function definition; (b) properties of the function.

For the (calculus) derivative: (a) means "rate of change"; (b) means the usual derivative properties e.g. the product rule and chain rule

For the arithmetic derivative [1] (or number derivative): (a) means "1 for any prime; everything else calculated via the product rule" [2]; (b) means the same as above

There are other examples of the above a/b split in mathematics. Finding examples is left as an exercise for the reader.

[1] https://oeis.org/wiki/Arithmetic_derivative

[2] Yes, the definition of (a) makes (b) obvious.


The skull splitting is a natural reaction because this has nothing to do with differentiation.

Throw in Leibniz’s rule and the reader is reduced to reading the fine print to understand.

The articles (Wikipedia included) are as guilty of this as whoever chose this name.


This definition does actually have a surprising amount to do with differentiation. The definition works for any unique factorization domain and in particular for polynomials.

It turns out that the definition here exactly matches the usual derivative for polynomials.


Are you saying that for arithmetic derivatives, the definition (part "a" above) "1 for any prime; everything else calculated via the product rule" has a surprising amount to do with differentiation?

If so, can you connect the dots?

Or did you mean the properties (part "b" above)?


Yes,

> 1 for any prime; everything else calculated via the product rule

does indeed have a surprising amount to do with differentiation!

If you take the usual polynomial functions in one variable (let's say x is the variable and all our things are complex numbers) then these can be factored: e.g. x^2 + 3x + 2 = (x+1)(x+2). They form a (so-called) unique factorization domain, which essentially means that factorization into "primes" works exactly the same as it does for integers. In the example above, (x+1) and (x+2) are examples of prime factors which can't be factored any further.

If you take the definition "1 for any prime; everything else calculated via the product rule" and apply it to this system where our "numbers" are polynomials and our "primes" are the polynomials we can't factor any further you get a definition of an "arithmetic derivative" for polynomials.

The fun fact then is that this arithmetic derivative we just defined is exactly the same as the usual definition of the derivative from calculus:

D[(x+1)(x+2)] = (x+1)D[(x+2)] + (x+2)D[(x+1)] = (x+1) + (x+2) = 2x+3

whereas

d/dx (x^2 + 3x + 2) = 2x + 3
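The worked example above can be checked mechanically with coefficient lists (lowest degree first; helper names are my own):

```python
def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists, lowest first."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def formal_deriv(f):
    """Formal derivative: d/dx of sum(c_i x^i) is sum(i*c_i x^(i-1))."""
    return [i * c for i, c in enumerate(f)][1:]

f = poly_mul([1, 1], [2, 1])   # (x+1)(x+2) = x^2 + 3x + 2
print(f)                        # [2, 3, 1]
print(formal_deriv(f))          # [3, 2], i.e. 2x + 3, as computed above
```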


More the first than the second.

There are things other than the integers for which it makes sense to talk about "the primes". One example is: polynomials (with coefficients in, let's say, the complex numbers). In this case it turns out that the "primes" are exactly the linear polynomials (ax+b) where a is nonzero.

There's a bit of ambiguity there, just as there is in the integers; 7 and -7 are "the same prime number", and x+3 and 5x+15 are "the same prime polynomial"; if we're going to say D(p)=1 then we need to pick which "version" of p has this property, and the obvious choice is the one of the form (x+a).

So, now, if we apply the same definition as for integers to polynomials with these conventions, it says: (1) D(x+a) = 1 and (2) D(fg) = fD(g) + D(f)g when f,g are polynomials. And that turns out to give the exact same result as the "ordinary" derivative for polynomials.

Whether "exactly identical to" implies "a surprising amount to do with" depends on how easily surprised you are, I guess.

... I glossed over the sense in which the "primes" are precisely the linear polynomials, so here are a few words about that for anyone who's curious.

If we look at polynomials with complex-number coefficients, a beautiful theorem says that they can all be written as A (x-r1) (x-r2) ... (x-rk), and then one polynomial divides another if and only if its set of rj is a subset of the other's (handling repeated roots in the "obvious" way). It's pretty easy to get from this that the linear polynomials are (1) the irreducible ones, i.e., the ones that can't be factored into lower-degree polynomials, and (2) the prime ones, i.e., the ones with the property that if p divides ab then p divides either a or b. (These properties are equivalent for the integers, as well as for polynomials with complex coefficients, but there are other settings in which they come out different, and both of them are useful, so they have different names.)

(What happens if we use real rather than complex coefficients? The Wikipedia "Arithmetic derivative" page claims that we still get the usual derivative, but that looks wrong to me, because if we work over the real numbers then x^2+1 is both prime and irreducible, but its derivative isn't 1. Maybe I'm missing something.)


As far as your point in parentheses goes, I think Wikipedia is either wrong or confusingly written (allowing complex factorisations of real polynomials makes what they've written consistent, but is a bit silly).

See theorem (20) on page 18 of this pdf for a theorem along these lines

https://cs.uwaterloo.ca/journals/JIS/VOL6/Ufnarovski/ufnarov...


This comment demonstrates my point, does it not? It looks like more of the same: a battle of definitions. If one's goal is to win a definitional battle, what do you accomplish if you succeed? [1]

But one will not consistently win such a battle. Many people will resist for various reasons, whether it be "stubbornness" or simply feeling like the other person shows no signs of trying to understand what they mean.

I propose that better goals include: (i) understanding what people are saying; (ii) applying the concepts to some productive end. By "productive" I mean some forward progress in an empirical or mathematical sense, whether it be prediction or proof.

So give up the battle. Why? Not because you are wrong. [2] Because "being right" about a definition is rather silly. We're talking about concepts being communicated by language and symbols. The goal is shared understanding of the concepts (which happens inside a brain), not merely enforcing a mapping of brain states to ink on a page (words) or vibrations in a physical medium (sound).

[1]: Whether you win or lose, the distinction between (a: definition) and (b: properties) still exists.

[2]: And not because you are "right" either. You can, at best, be consistent in your definitions and use them in useful ways.


I’m saying that if confusion and annoyance has been the norm, then a quick translation for the general audience is in order.

Even better would be a differentiating name, but I realize that’s unlikely.


> The skull splitting is a natural reaction because this has nothing to do with differentiation.

I recommend rephrasing that as "the definition (part "a" above) of the arithmetic derivative is different from the calculus definition of the derivative."

Do you see? Stating it this way reduces the war of words. Your point is made clear. [1] Then other people can say "Ok, sure, but don't you see how the properties (part "b" above) are the same? And isn't that interesting?"

Think of this another way. Imagine an alternate history where the arithmetic derivative was discovered, named, and socialized first. Then imagine calculus came along later. If so, would calculus be wrong to use the same word, "derivative"? ... I won't answer that question because it is invalid. Better to dissolve the question [2].

My point? Let's try to shift away from historical battles over turf and terminology. Let's find ways to share insight.

[1] Unless your intended point was: "how dare you use the word differently?"

[2] https://www.lesswrong.com/posts/Mc6QcrsbH5NRXbCRX/dissolving...


I have zero stakes in this other than stating perhaps it’s not the audience that’s at fault, when an in-circle term isn’t translated for a more general audience; and the audiences are consistently confused and annoyed.

I woke up, noticed the title & blog (and references), went through the same tortured route and confusion, before stumbling on the fine print, and coming to same “oh, for crying out loud” reaction.


Is it fair to say the definitional confusion bothered you more than the interesting aspects (I'm presuming that part "b" above is more interesting) pleased you?

If I were to guess... I'd say you (and many people, including myself, often) are wary of people redefining words in a way that seems wasteful, distortive (such as 'stealing' words that formerly had clear technical meanings), purely commercial, or self-promotional.

For me, at least, the intent of the redefinition matters. But I detect no self-interest or obvious neglect in the case of the arithmetic derivative.


Title needs (2014).


It wouldn't hurt, but why "needs?" The notion isn't changing.


While we're at it, why not (1961)? That's when Edward Barbeau published his paper. The link to that paper in the article is sadly 404.



