I believe there was also a discussion at one point about why somebody was removing it from their site. I can't find it right now, but I remember the reason being excessive contacts from people using automated tools to scan their site.
> Create a security.txt file at the “/.well-known/” path of the agency’s primary .gov domain. This file must include the Policy and Contact fields, as specified in the Internet-Draft.
I know Google, Facebook, and Github now have this, though Apple, Microsoft, Amazon, Stripe, and Square do not (to rattle off the other tech companies with huge security teams.)
But I don't really get it. Why is this a good idea?
As 'DyslexicAtheist related, if you put this page on your site, people will crawl for it and find you, and then kick off horrible automated scanners that generate bogus bugs on every site they check, and then submit bounty requests for them.
Meanwhile: what serious tester would be impeded by not having this page to refer to, if you have a /security page on your main site, which you have to have anyways for this to work? Recall that most of the fields in security.txt are themselves just URL pointers.
I also think the RFC itself is kind of funny. Do not rely on security.txt as authorization to test a site! it says, as if an RFC really had the authority to establish that, rather than an expensive court case in which both sides of the argument will have competing claims about whether testing was allowed or (the US default) not.
I agree with the concern about bogus reports from automated scanners -- Tarsnap gets a lot of those (run against the website, even though Tarsnap's bug bounties explicitly exclude the website!).
I suspect that if this is useful at all, it's not for serious testers but rather for people who stumble across issues by accident. There are times when I've had to phone a number in whois to report a problem, because all of the contacts on a company's website went to customer support or marketing -- and there was one occasion when I didn't report something I noticed, because the whois was anonymized.
Bogus reports are an issue security.txt or not. The single and only thing the file provides is a trivially discoverable way to get the contact/submission information for security issues, should they matter for whoever posted the file. That's it. Why make people search for 10min when they can be at a standardized location.
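To make "trivially discoverable" concrete, here is a minimal sketch of reading the contact information out of such a file. The sample contents and the `parse_security_txt` helper are made up for illustration; the only thing assumed about the format is that it is a list of `Field: value` lines with `#` comments.

```python
def parse_security_txt(text):
    """Collect 'Field: value' lines into a dict of lists, skipping comments."""
    fields = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a field line
        # Field names are compared case-insensitively; a field may repeat.
        fields.setdefault(name.strip().lower(), []).append(value.strip())
    return fields


# Hypothetical file contents, for illustration only.
SAMPLE = """\
# Security contact for example.com
Contact: mailto:security@example.com
Policy: https://example.com/security
"""

fields = parse_security_txt(SAMPLE)
print(fields.get("contact", []))
```

Splitting on the first colon only is what keeps values such as `mailto:...` and `https://...` intact.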
The real reason you can’t throw it out right away is that if somebody reports a security issue to you, no matter how bullshit it is, you can’t just outright dismiss it at the risk of a snarky blog coming out attacking your security integrity. Even if the blog is complete nonsense, not everybody who reads it will know that, so the risk to reputation is too high. Bug bounties are almost completely full of incredibly low-effort scanner spam. A company can easily end up wasting huge amounts of its precious security budget getting report spammers to leave them alone. You absolutely never want to do anything to attract those kinds of people to your business, unless you have very carefully weighed up the costs and benefits first.
I honestly think most of the effort that goes into bug bounties and things like security.txt is just busy work for security charlatans who don’t know any better way to be spending their time.
If I'm understanding correctly, when "automated scanners" is mentioned here, what's meant is tools that someone runs directly, which scan the software/device and generate loads of red flags. Having a security.txt file makes it that much easier for the tool (or the user) to report the issue, valid or not.
How long does it take you to find the security contact for the coreboot project (www.coreboot.org)? I guess it's on the order of seconds.
And yet we had some interaction on the main mailing list with a journalist covering security fluff (there wasn't much substance) just recently claiming that they "didn't find the contact address" (see https://mail.coreboot.org/hyperkitty/list/coreboot@coreboot....)
So yes, one could say that the journalist should have been more diligent in their search and yet, security is one of the issues where I'd rather hear about things too often than miss something.
If that standard gains traction, I'd expect even a half-competent person in the field to know to look for the two URL patterns (or to use a tool that does it for them) which eliminates the question of "were we clear enough on how to reach out?"
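A rough sketch of what "a tool that does it for them" could start from, assuming the two patterns meant are the `/.well-known/` location and the legacy top-level `/security.txt`. The helper name is made up, and nothing is fetched here; it only builds the URLs to try.

```python
from urllib.parse import urlunsplit


def security_txt_candidates(host):
    """Return the URLs to try for a host, preferred location first."""
    paths = ["/.well-known/security.txt", "/security.txt"]
    return [urlunsplit(("https", host, path, "", "")) for path in paths]


for url in security_txt_candidates("www.coreboot.org"):
    print(url)
```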
"chance of someone occasionally being less wrong on the internet" for "certainty of steady stream of bogus reports" seems like a pretty bad trade off, let alone one that requires a special standard to achieve.
What's really rough about the steady stream of bogus reports is that they're injecting noise into a channel you really need to read, because amidst all the garbage CAA and Clickjacking and cookie handling and CSP header and Firefox 2.x open redirect nonsense are occasional serious reports of things like XSS, and the important reports aren't any better written than the bad reports so you have to read everything kind of carefully.
My experience has been that you get the decent reports regardless of whether you even have a /security URL, but if you open a formal bounty program you instantly and permanently ratchet up the noise, because this is a way for people in lower-income countries to make money in their bedrooms with a laptop and without much thought. You shitcan the bullshit CAA record reports, but other companies without people who are good at triage on staff don't; they'll shell out a couple bucks for them, even though they're garbage.
> Will adding an email address expose me to spam bots?
>
> The email value is an optional field. If you are worried about spam, you can set a URI as the value and link to your security policy.
It's not spam in the c1alis sense; these are people motivated enough to actually direct a crappy scanner at your site. They'll take the extra minute to go find the email address on the web page you link to.
"Do not rely on security.txt as authorization to test a site!", I was actually thinking of putting up a security.txt recently to do exactly this and explicitly define the terms upon which it would be ok to have people probe my site without retribution. Arguably, this might actually be the most valuable thing to include in a security.txt. I can't really see any down side (can anyone think of any?) as this would help attract attention from the most noble white hats who are likely to help you and are eager to sharpen their skills. Black hats won't care what you put in it. You'd probably want a lawyer to help you flesh out decent terms, but it would be a great playground to have a list of sites that invite people to pen test them under clear terms for mutual benefit.
Having a bug bounty program is generally a delicate dance between letting people too loose at your infrastructure and catching any serious bugs. security.txt or not that's something that probably has to be written clearly or it's "all rights reserved" (don't try to hack us and if you do find something then be very cautious).
As an additional thought that just occurred to me: we haven't seen how authorization to probe systems (a bug bounty system) would interact with GDPR, e.g. when a breach happens due to an "ethical" hacker, have we?
Interestingly this page started showing up in my server logs recently. I was actually thinking about putting one up but after reading your post I should probably think twice.
If you had a machine readable definition for what is allowed within a password, wouldn't it make more sense to put that close to the password entry itself (e.g. on the input element/page)?
Having a site-wide definition seems like it would require a lot of mechanics to target specific pages or properties that may have different rules (e.g. consumer vs. corporate logins).
I still hope companies start following NIST guidelines and stop having password rules entirely.
Wouldn't this just make password crackers easier? If there's a Regex of what passwords are okay, it lowers the search space.
If knowing the rules for acceptable passwords makes it significantly easier to brute force passwords, that sounds like more of an argument to not have those rules in the first place since it wouldn't take an attacker long to figure them out himself even if they aren't published. Hiding the password policy is a very weak form of security through obscurity.
I guess it would make it easier to programmatically determine which websites have insecure password policies (like an alphanumeric password no more than 8 characters long), but the problem here is the password policy, not publishing the rules.
You're probably right theoretically, but people tend to choose really weak passwords if they can choose anything.
Maybe passwords should always just be auto-generated and people should be told to write them down in a... actually, nevermind that. Passwords should be a thing that's integrated in your browser/computer experience... this is something that can and should be handled by computers. You should only ever have to log into your computer and be secure from then on.
This whole insanity of point-to-point invention of secrets needs to die.
> Passwords should be a thing that's integrated in your browser/computer experience... this is something that can and should be handled by computers.
I had this crazy idea, whereby computers could themselves come up with very long, random sequences of bits.
They would then use these openers (couldn't think of a better word) to authenticate to each other, secured by mathematical operations, in place of passwords.
Sadly, I've never seen it used on websites, so it must not be a good idea. /s
Certificate auth for HTTP with TLS and Kerberos auth for HTTP are both specified and supported by all (major desktop) browsers.
However: The UX on the browser side is shitty as hell. Certificates display weird nagscreens, without being able to specify proper defaults like "use this cert for that site and don't bother me again". Certificate enrollment has been broken (not that the form element was ever great) by all major browsers, to be replaced by "do something in Javascript maybe, if we get around to implementing a new API some time". Oh, and no logout...
Kerberos needs a parameter at browser start or an about:config setting, is incompatible with using multiple TGTs let alone automatically selecting the right one or gasp getting a new TGT for the user from the proper KDC. The only thing that kinda works mostly is using the standard company login TGT. Oh, and logging out doesn't work...
Oh, and of course most mobile systems are broken or just unsupported.
The sorry state of browser auth is 100% on browser vendors dragging their feet on those problems that have been known for around 20 years or so. And no, webauth won't save us, it'll just be another shitshow most likely.
I always felt like one of the UX blockers for key exchange was the assumption that people couldn't be expected to learn the basics, because it's too complicated.
And every attempt to ignore or hide it has just made the entire thing more complicated or confusing.
You can shoot yourself in the foot with a gun or crash your car into a wall, and yet most people manage not to do so on a daily basis.
> You're probably right theoretically, but people tend to choose really weak password if they can choose anything.
Not in my experience. If I can choose anything I'll write a decent-length phrase. If I'm forced to use numbers and symbols I'll make the shortest thing I can.
Certainly there's no reason to reject a 20+ letter all-lowercase password. Apply your capital/number/symbol rules to passwords shorter than 16 characters if you must.
I would say in general, no. For a long time, American Express had something along the lines of `^[A-z0-9]{6,8}$`. Given a candidate string, evaluating even this extremely basic regex will take more time than comparing it against a hash table. I suppose you could flip it around and use it as a generator for an arbitrary dictionary, but then you'll run into memory limitations first.
Plus, there's always the fact that an adversary can just, you know, go find the password requirements on the website and generate a matching regex themselves.
Of course, if the regex is something like `^password$`...
If you have the regex, you'd use it to generate candidate strings. Then, of course you don't have to check any more.
For a more complex regex, you could resort to a simplified version (eg, with larger search space). But the space of all random strings is way too huge to just generate random bitstrings and hope they match the regex.
> Wouldn't this just make password crackers easier? If there's a Regex of what passwords are okay, it lowers the search space.
In practice, this shouldn't make things easier for password crackers, because trying to crack a password by enumerating the password space is not a normal approach. (Except for rainbow tables.)
What you'd expect a password cracker to do is construct passwords according to a model of what kinds of passwords humans actually create (regardless of the formal password requirements), and guess those. You're not trying to make sure you've covered everything -- you're just trying to make high-probability guesses before you start making low-probability guesses.
This isn't such a bad thing. If this standard were ubiquitous, Troy Hunt could maintain a top-1000 list of easiest-to-brute-force websites according to their password-requirement declarations. One might present one's own top ranking to one's CTO to help start a conversation about eliminating voodoo security practices.
Hopefully the Top-1000 page would include an intro why the list should be empty.
for most public websites, you can reverse engineer the rules pretty easily by trying to create accounts. this can be automated pretty easily.
in any case, the only reasonable rules for passwords are probably a length requirement and possibly requiring numbers and/or symbols. knowing that a password must be at least 8 chars and include at least one number and a symbol does not reduce the search space by much.
Any half competent brute force is going to do the simpler patterns first anyway. Knowing the exact set of special characters allowed won't make it significantly faster.
It's a lot like telling someone the exact length of your password. Okay, by knowing that it's 11 characters they can skip testing 1-10... but that only lets them skip 2% of the calculations.
Regular expression engines are notoriously buggy. You definitely wouldn't want to pass random expressions to libpcre.
Web browsers are already running random expressions from the internet, but their regex implementations have proven the rule by being fodder for exploit writers. In any event, requiring a password manager to pull the engine from Firefox or Chrome wouldn't be cool. They're more likely to use libpcre or re2, but see above.
The NAPTR DNS record type uses regular expressions for domain rewriting: https://en.wikipedia.org/wiki/NAPTR_record
Not many DNS libraries support that record type as it adds a lot of problematic bloat.
libpcre is exactly the implementation I’m not talking about, because it supports too much. I think re2 doesn’t do backtracking, so it’s probably a better choice.
The number of DFA states can explode exponentially. It's a real problem, especially if you're compiling untrusted expressions off the internet.
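A small sketch of that blow-up, using the textbook pattern `(a|b)*a(a|b){n-1}` ("the n-th symbol from the end is an a"). The NFA is hand-built rather than compiled from the regex, and the function name is made up; the point is only that an (n+1)-state NFA turns into 2^n DFA states under the subset construction.

```python
def dfa_state_count(n):
    """Count reachable DFA states for (a|b)*a(a|b){n-1} via subset construction."""

    def step(states, symbol):
        nxt = {0}  # NFA state 0 loops on every symbol
        for s in states:
            if s == 0 and symbol == "a":
                nxt.add(1)  # guess that this 'a' is n-th from the end
            elif 1 <= s < n:
                nxt.add(s + 1)  # count the symbols that follow the guess
        return frozenset(nxt)

    start = frozenset({0})
    seen = {start}
    frontier = [start]
    while frontier:
        current = frontier.pop()
        for symbol in "ab":
            nxt = step(current, symbol)
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return len(seen)


for n in (4, 8, 16):
    print(n, dfa_state_count(n))
```

A pattern a few dozen characters long is enough to make a naive DFA compiler run out of memory, which is why engines that build DFAs usually do so lazily or with a size cap.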
Evaluation is linear, but you're just evaluating your own password. In this scenario, it may make sense to just use a simple backtracking implementation, though as you alluded to earlier there aren't many simple ones around.
But it's not executing a match, it's creating a random password picker. That's exactly equivalent to parsing the pattern where the actions on terminals are to generate random characters.
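A minimal sketch of that idea, restricted on purpose to patterns of the form `^[class]{min,max}$` so that no general regex engine is involved. The helper names are made up, and the example uses `[A-Za-z0-9]` rather than `A-z`, since in ASCII the range `A-z` also covers a few punctuation characters.

```python
import re
import secrets

# Only this one shape is accepted: ^[...]{m,n}$
SIMPLE = re.compile(r"\^\[([^\]]+)\]\{(\d+),(\d+)\}\$")


def expand_class(body):
    """Expand a character class body such as 'A-Za-z0-9' into its characters."""
    chars = []
    i = 0
    while i < len(body):
        if i + 2 < len(body) and body[i + 1] == "-":
            lo, hi = ord(body[i]), ord(body[i + 2])
            chars.extend(chr(c) for c in range(lo, hi + 1))
            i += 3
        else:
            chars.append(body[i])
            i += 1
    return sorted(set(chars))


def random_password(pattern):
    """Generate a random string matching a ^[class]{min,max}$ pattern."""
    m = SIMPLE.fullmatch(pattern)
    if m is None:
        raise ValueError("unsupported pattern")
    alphabet = expand_class(m.group(1))
    lo, hi = int(m.group(2)), int(m.group(3))
    # Length is picked uniformly, which is not uniform over all matching strings.
    length = lo + secrets.randbelow(hi - lo + 1)
    return "".join(secrets.choice(alphabet) for _ in range(length))


print(random_password("^[A-Za-z0-9]{6,8}$"))
```

Anything outside that one shape is rejected instead of guessed at, which sidesteps the buggy-engine concern at the cost of expressiveness.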
> I also think there should be a similar file for the regular expression that expresses a site’s allowable passwords.
I disagree. Password "complexity" requirements need to die and instead every site's password regex should be `.{12,64}` and then verify it's not in the dictionary or a list of leaked passwords.
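Roughly what that check looks like in code. This is a sketch, not a vetted policy: the `BANNED` set is a stand-in for a real dictionary or leaked-password list, and comparing case-insensitively is my own assumption.

```python
import re

# The length rule from the comment above; DOTALL so newlines count as characters.
LENGTH_RULE = re.compile(r".{12,64}", re.DOTALL)

# Hypothetical stand-in for a real wordlist or leaked-password corpus.
BANNED = {"passwordpassword", "123456789012", "qwertyuiopasdf"}


def is_acceptable(password, banned=BANNED):
    """Accept any 12-64 character password that is not on the banned list."""
    if LENGTH_RULE.fullmatch(password) is None:
        return False
    return password.casefold() not in banned


print(is_acceptable("correct horse battery staple"))
print(is_acceptable("PasswordPassword"))
```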
The current thinking among the security community is that sites should allow all passwords that aren't found on a wordlist, so that would be one hell of a regex.
An interesting attack for any website that allows customers to choose their own string for use in the url would be to choose the name ".well-known" - I'm sure I've seen sites that take user input and create a folder based on that name. Similar to reddit.com/r/.well-known except as a top-level folder but I can't find a good example right now.
Once that happens, they could put up a malicious security.txt file and get free security reports sent to an email of their choice.
But as you hint, r/.well-known does NOT in fact exist and you can't make it
The /.well-known/ paths were reserved by RFC specifically because this doesn't really happen. You could _make_ a site with the defect you imagine on purpose and perhaps somewhere in the vast galaxy of sites there's already one that can be abused to do this, but they're vanishingly rare. To the point where I don't know of one even though I went looking.
One reason this prefix would be expected to be especially rare is that dot directories are both illegal in Windows (though the NT kernel has no problem with them) and special in Unix systems (they're legal but they're treated differently because it was convenient) and so any developer or administrator working with file paths for their web server is likely to disable names starting with dot at about the same time they disable names starting with slash and other shenanigans which blow stuff up.
Now of course a URL doesn't have to reflect a path on disk, but as soon as it doesn't we're talking about a slightly more clued in developer, maybe someone who has heard about security and thinks it might be a good idea.
And for anyone who doesn't know why .files are treated differently:
In every Unix folder there are two special files, named '.' and '..' - the first is a link to the current folder, the second to the parent.
It was annoying to see these two files listed every time you wanted to see what was in a folder (using ls) so `ls` learnt to not show these files by default. The way that was coded was (erroneously) to not show any file whose name started with a '.'
People eventually realised that this had the unintended side effect of hiding files that start with '.' and found it useful, so the bug became a feature.
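In code terms the shortcut amounts to a prefix test rather than an exact comparison against the two special names. A sketch of both behaviours, with made-up function names:

```python
def ls_as_intended(names):
    """Hide only the two special entries '.' and '..'."""
    return sorted(n for n in names if n not in (".", ".."))


def ls_as_shipped(names):
    """Hide every name that starts with a dot."""
    return sorted(n for n in names if not n.startswith("."))


entries = [".", "..", ".well-known", "index.html"]
print(ls_as_intended(entries))
print(ls_as_shipped(entries))
```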
> dot directories are both illegal in Windows (though the NT kernel has no problem with them)
No they're not. They're mildly annoying at most. Explorer wouldn't let you directly name a file or folder ".foo", but you have always been able to type ".foo." and it would save the name as ".foo". You could also use the command prompt, or any non-explorer interface.
And they removed that annoyance entirely in Windows 10 version 1903.
I think this is nice, but having this and a lot of other metadata in a machine-readable form would be nicer. You could also think of things like licenses and copyrights.
Technically, the problem this specification solves is making it easier to find security related metadata for projects that typically have this information already but just not an easy to find place. So, the next logical step would be making this machine readable so IDEs, project hosting websites, and other tools, can do the right things instead of having to parse files intended for humans with no consistently used syntax.
Was expecting something like a yaml or json file to be a good standard for this. It's nearly 2020, we should not be improvising text file formats in standards anymore. .txt screams intended for humans.
You're thinking too short term. .txt far predates, and will far outlast, any standard du jour. I bet you wouldn't be so gung ho about security.xml.
I really don't see the issue. robots.txt works just fine, and this is simply following in its highly successful footsteps. What you suggest is change for change's sake.
Where does that go? To whom? What do they do with it?
In a lot of companies, any generic name email address probably goes to the trash, some group like customer service, an auto-responder that says "Sorry no" and sometimes to marketing. None of these help you. On rare occasion webmaster@ might get to someone who is useful and technical, but it might also be in a folder with thousands of other junk messages.
A better guess would be security@ . But still, wouldn't it just be easier to have a standard place to lookup who or where to go to contact someone for security issues? oh yea... security.txt
Both "webmaster" and "security" are standardized in RFC 2142, so one should be able to contact those easily. It's indeed not what happens in practice most of the time, but not sure if throwing more standards at it is a good solution: if/when most of the online services will follow those, they'd probably follow different ones.
For people discussing automated tools and noise, prediction: we will start seeing automated tools flagging "missing security.txt" as some sort of vulnerability and I will end up implementing it just to stop the noise. It feels like a lose-lose situation.
This seems like a really poorly thought out proposal.
From the draft's [1] intro, which provides motivation for the proposal:
> When security vulnerabilities are discovered by independent security researchers, they often lack the channels to report them properly [...]
Occam's Razor: they lack the channel to report those vulnerabilities because the company doesn't give a damn about security, not because of some hitherto unsolved technical difficulty.
Companies that give a damn about security already have either a "catch-all" contact for security related things displayed on their contacts page, or have dedicated pages detailing what the hell you are supposed to do to report a vulnerability. Companies that don't give a damn will continue to not give a damn and will ignore your document, and will continue to suck at security until someone starts punishing them - monetarily - for such appalling behavior.
But let's leave that aside for a moment. Researchers spot vulnerabilities. Need a way to report them. So what's the solution?
Apparently, a text file with only one mandatory field...
> 3.5.3. Contact
> This directive indicates an address that researchers should use for reporting security vulnerabilities. The value MAY be an email address, a phone number and/or a web page with contact information.
...which contains... er... a catch-all contact for security related things and or a link to a contacts page detailing exactly what the hell you are supposed to do to report a vulnerability? sigh...
But the real kicker is that the standard doesn't even require that the contact information is up to date or valid:
> 6.2. Incorrect or Stale Information
> [...] Organizations SHOULD ensure that information in this file and any referenced resources such as web pages, email addresses and telephone numbers are kept current, are accessible, controlled by the organization, and are kept secure.
In case the meaning of "SHOULD" is in question see RFC 2119 [2], but basically, they "strongly recommend" that you keep your contact information up to date - instead of, I dunno, REQUIRING it, since having a channel to properly report security vulnerabilities is the ENTIRE POINT of this exercise?
/facepalm
Are we really supposed to take this seriously? This is simply security theater.
Original HN thread: https://news.ycombinator.com/item?id=15416198