Popular Open-Source Project Moq Criticized For Quietly Collecting Data (bleepingcomputer.com) 30
An anonymous reader quotes a report from BleepingComputer: Open source project Moq (pronounced "Mock") has drawn sharp criticism for quietly including a controversial dependency in its latest release. Distributed on the NuGet software registry, Moq sees over 100,000 downloads on any given day, and has been downloaded over 476 million times over the course of its lifetime. [...] Last week, one of Moq's owners, Daniel Cazzulino (kzu), who also maintains the SponsorLink project, added SponsorLink to Moq versions 4.20.0 and above. This move sent shock waves across the open source ecosystem largely for two reasons -- while Cazzulino has every right to change his project Moq, he did not notify the user base prior to bundling the dependency, and SponsorLink DLLs contain obfuscated code, making it hard to reverse engineer, and not quite "open source."
"It seems that starting from version 4.20, SponsorLink is included," Germany-based software developer Georg Dangl reported referring to Moq's 4.20.0 release. "This is a closed-source project, provided as a DLL with obfuscated code, which seems to at least scan local data (git config?) and sends the hashed email of the current developer to a cloud service." The scanning capability is part of the .NET analyzer tool that runs during the build process, and is hard to disable, warns Dangl. "I can understand the reasoning behind it, but this is honestly pretty scary from a privacy standpoint."
SponsorLink describes itself as a means to integrate GitHub Sponsors into your libraries so that "users can be properly linked to their sponsorship to unlock features or simply get the recognition they deserve for supporting your project." GitHub user Mike (d0pare) decompiled the DLLs, and shared a rough reconstruction of the source code. The library, according to the analyst, "spawns external git process to get your email." It then calculates a SHA-256 hash of the email addresses and sends it to SponsorLink's CDN: hxxps://cdn.devlooped[.]com/sponsorlink. "Honestly Microsoft should blacklist this package working with the NuGet providers," writes Austin-based developer Travis Taylor. "The author can't be trusted. This was an incredibly stupid move that's just created a ton of work for lots of people." Following the backlash, Cazzulino updated the SponsorLink project's README with a lengthy "Privacy Considerations" section that clarifies that no actual email addresses, just their hashes, are being collected.
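For readers who want a concrete picture of the behavior described in the summary, here is a rough Python sketch. It is not SponsorLink's code (that is an obfuscated .NET DLL). The function names, the exact git command, and the trim/lowercase step are all assumptions made for illustration; only "spawn git, read the email, SHA-256 it" comes from the report.

```python
import hashlib
import subprocess


def git_email() -> str:
    """Ask the local git installation for the configured user email.

    Assumption: the report only says an external git process is spawned;
    `git config --get user.email` is one plausible way to do that.
    """
    result = subprocess.run(
        ["git", "config", "--get", "user.email"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout.strip()


def hash_email(email: str) -> str:
    """Return the hex SHA-256 digest of an email address.

    Assumption: trimming and lowercasing before hashing is illustrative;
    the report does not say how (or whether) the input is normalised.
    """
    normalised = email.strip().lower()
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()
```

Calling `hash_email(git_email())` on a developer machine would yield the kind of 64-character identifier the report describes. The sketch deliberately leaves out any network call.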
The data we collected was worthless we swear! (Score:4, Insightful)
Are they trying to distract the lawyers with math! It might actually work.
look a hash see it's meaningless without the rest of the breakfast like eggs and bread.
Re:The data we collected was worthless we swear! (Score:3)
So why are you collecting it if it's worthless?
he really isn't collecting anything, just trying to identify people who use his software to invite a sponsorship relation.
it probably seemed like a good idea in his head, and i don't think he had any dishonest intentions, but the whole thing is just an incredibly stupid occurrence with an asinine implementation, aggravated in his case by the method of delivery: not only isn't it opt-in, it doesn't even warn users of its existence and just silently includes binary code that gets randomly invoked in the user's ide. it's just a monumental clusterfuck, beyond stupid.
he developed a library that is extensively used by .net developers, but they are apparently not in the mood to donate. i can understand his frustration, but i'm surprised he thought that this could work, and that he thinks he's going to get away with just a disclaimer in the README after this fuck up, i would have reverted that crap post haste and thrown an immediate apology.
Re:The data we collected was worthless we swear! (Score:0)
he really isn't collecting anything, just trying to identify people who use his software to invite a sponsorship relation.
So what does that mean in English? He wants to persuade them (or someone else?) to financially support his project?
Re:The data we collected was worthless we swear! (Score:4, Insightful)
If they collected email hashes they can
a) Count unique emails of downloaders. No idea what the purpose is, but it does not need to be benign.
b) Given a list of emails, they can check whether they downloaded the code. Hence the anonymity is very limited as many people publish their email addresses. For some purposes this may be almost as good as directly collecting the email addresses.
Anyways, a piece of closed-source code has no place in OSS and could always do bad things or contain malware.
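Both points are easy to demonstrate. The sketch below is a minimal illustration, assuming plain unsalted SHA-256 as reported in the summary; the function names and the example.com addresses are made up for the demo.

```python
import hashlib
from typing import Dict, Iterable


def sha256_hex(email: str) -> str:
    """Plain, unsalted SHA-256 of an email address."""
    return hashlib.sha256(email.encode("utf-8")).hexdigest()


def count_unique(collected_hashes: Iterable[str]) -> int:
    """(a) Count distinct users without knowing who they are."""
    return len(set(collected_hashes))


def reidentify(collected_hashes: Iterable[str],
               known_emails: Iterable[str]) -> Dict[str, str]:
    """(b) Map collected hashes back to addresses from a known list.

    No hash is "reversed": every candidate address is hashed once and
    the results are compared by dictionary lookup.
    """
    lookup = {sha256_hex(email): email for email in known_emails}
    return {h: lookup[h] for h in collected_hashes if h in lookup}


if __name__ == "__main__":
    collected = [sha256_hex("alice@example.com"),
                 sha256_hex("alice@example.com"),
                 sha256_hex("carol@example.com")]
    public_list = ["alice@example.com", "bob@example.com"]
    print(count_unique(collected))
    print(sorted(reidentify(collected, public_list).values()))
```

Anyone holding a list of candidate addresses (a scraped commit log, a leaked dump) can run the same comparison, which is why the anonymity is so limited.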
Collecting email hashes is collecting emails. (Score:4, Insightful)
The collection of email hashes is equivalent to collecting emails.
There is only one use of an email hash: to compare it to another list of email hashes to correlate it. Inevitably this produces an email address.
Hashing is not encryption. It's obfuscation. When you obfuscate a small piece of data you are simply creating a data index into another data set. The hash acts like a key from an index.
email addresses are not protected by hashing them. It's a lie to call this a privacy step.
Re:Collecting email hashes is collecting emails. (Score:2)
Hashing can provide solid protection, but it takes some thought to do it. For example, if someone hashes email addresses with MD5 or SHA256, one can still guess a ton of email addresses by brute force. Without it, all a hash does is replace the email address with an "id" which can be looked up or computed, just as the parent mentions.
To actually protect the email addresses requires:
* A computationally expensive hashing algorithm like bcrypt or yescrypt so that brute forcing takes a lot of time.
* A salt. This way, a rainbow table can't be applied.
* A pepper, which is kept secure and used as the key for HMAC(salt + email). This ensures that the hashes are absolutely worthless without having the HMAC key. If this is all public code, this can't be done.
Without these, might as well skip all the smoke and mirrors and just send up the email address.
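A minimal sketch of those three ingredients, using only Python's standard library. Assumptions: PBKDF2-HMAC-SHA256 stands in for bcrypt/yescrypt (neither ships with the stdlib), and the function name, iteration count, and the way the pieces are combined are illustrative rather than a vetted design.

```python
import hashlib
import hmac
import os


def protect_email(email: str, salt: bytes, pepper: bytes,
                  iterations: int = 100_000) -> bytes:
    """Derive a hard-to-brute-force identifier from an email address.

    pepper: secret key that never leaves the server; without it the
            stored hashes cannot be recomputed from guessed addresses.
    salt:   non-secret value that defeats precomputed (rainbow) tables.
    """
    # Keyed step: HMAC over salt + email, with the pepper as the key.
    keyed = hmac.new(pepper, salt + email.encode("utf-8"),
                     hashlib.sha256).digest()
    # Slow step: a deliberately expensive KDF over the keyed digest.
    return hashlib.pbkdf2_hmac("sha256", keyed, salt, iterations)


if __name__ == "__main__":
    salt = os.urandom(16)      # stored next to the hash
    pepper = os.urandom(32)    # kept in server-side secret storage
    print(protect_email("dev@example.com", salt, pepper).hex())
```

Note the catch from the last bullet: the pepper only helps if it stays on the server, so the client would have to send the raw address for the server to hash. A client-side library with a baked-in pepper is effectively public, and the pepper protects nothing.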
Re:Collecting email hashes is collecting emails. (Score:2)
"Pepper" is very rarely used because, as you state correctly, it does not work unless the pepper is kept secret. That means you need to distribute and protect a shared secret key and that is an overall bad option with tons of security problems. In fact, I started studying crypto about 35 years ago and kept current, and this is only the second time I hear about "pepper".
Re:Collecting email hashes is collecting emails. (Score:2)
I remember talking about pepper in the 90's. It has never come up in a work context in the last 25 years for me. So I'm with you there.
But is it even needed? No. The overhead of managing that secret is massive overkill when it's only used for a hash. The secrets management of the pepper means you have to have a whole shared cryptography secrets capability in place. Well, shared secrets, that is.
But back to this train wreck on privacy claims. Again, a simple hash does not protect an email address. It's just too easy to obtain a real email address from that. No brute force reverse hash needed. Simply download a list of email addresses and hash those and compare.
I've worked on many projects and many many times I have had to step in and prevent data collection and storage. Almost every time it's, "We grab this just in case we will need it in the future. We have no sinister plans for it." And they mean it. They genuinely think it's OK. Simply because they are convinced they and the team will always use it ethically.
The fact is the data is almost always used unethically within a very short period of time. And people don't even realise that's what they are doing.
Re:Collecting email hashes is collecting emails. (Score:2)
Indeed. And that is the core reason why the GDPR forbids all collection of personally identifiable information unless it is specifically allowed with a valid and current business reason. Anything else just does not work to protect people. If the data is collected, sooner or later it _will_ be misused or leak.
Re:Collecting email hashes is collecting emails. (Score:3)
There is only one use of an email hash: to compare it to another list of email hashes to correlate it. Inevitably this produces an email address.
Not quite. The second purpose is counting how many unique emails are in there. But yes, email hashes are very likely easy to reverse. Most people publish their email addresses in some way. Also, since the code is closed, it could well send other stuff and verifying it does not is not trivial.
Re:Collecting email hashes is collecting emails. (Score:0)
Most people publish their email addresses in some way.
Mine are published at haveibeenpwned.com.
Re:Collecting email hashes is collecting emails. (Score:1)
Look at the bright side (Score:3)
This is a perfect example of why open source is great: the author chose the dark side, but the proverbial enough eyeballs spotted his shenanigans and we all get to be rightfully livid about it.
That's open source in action my friend! It works exactly as it should.
Re:Look at the bright side (Score:2)
Re:Look at the bright side (Score:2)
More correctly this is installing a garage door opener from a dirtbag who gives neighborhood hoodlums a universal remote that can open any of his bum products.
Re:Look at the bright side (Score:2)
Not quite. The ensuing shit-storm has preventative qualities for others that were thinking about doing the same crap.
Re:Look at the bright side (Score:3)
While I don't condone this data grab I think it's a symptom of one of the problems with the current era of open source - there is very little giving back on the part of companies making money while leveraging a lot of projects.
It was raised several years ago after some security vulnerabilities that there are a lot of critical or high usage projects which are maintained by one person in their spare time.
Re:Look at the bright side (Score:3)
There is virtually zero giving back. The entire OpenSSL libraries, Log4J, and GnuPG were/are all maintained by a skeleton crew of volunteers. With how important their security is, properly funding projects like this should be something that not just companies do, but governments, because even though "security has no ROI", having critical F/OSS code fail will cost them tons of money when it does happen.
Re:Look at the bright side (Score:1)
You're talking about open source projects, code published by its creators under free licenses.
"Giving back" was never part of the deal. That's a different kind of software, and for all its funding it has never had a better security track record than FOSS.
Re:Look at the bright side (Score:3)
Indeed. And that is a problem. I think we need some general funding of FOSS to prevent critical stuff being maintained under untenable conditions. GnuPG, fortunately, is now solidly financed by donations because the author spoke up, but this approach is not enough. There are some efforts in the EU to address this problem, but I think much more is needed. The amount of money this needs is probably laughable compared to the amount of value created, but there needs to be some systematic financing of FOSS maintainers. There also should be something like financing for independent security reviews for security-relevant FOSS in addition.
Re:Look at the bright side (Score:3)
How should a company approach this? A small company might use Debian. And Apache/Nginx, and MariaDB/MySQL, and OpenSSL, and a whole bunch of GNU tools, and a scripting language (Perl/Python) and a compiler and subversion/git and a thousand other things like Grub and fail2ban and who knows what else.
It's just not reasonable to identify a thousand different projects and donate a tiny amount of money to each one. So how can these projects be supported? And what should a company like this do?
Re:Look at the bright side (Score:2)
Re:Look at the bright side (Score:3)
the current era of open source - there is very little giving back on the part of companies making money while leveraging a lot of projects.
This is hardly new. People who develop, maintain or package open source projects are usually paid peanuts - if anything - while big tech profits immensely from their unpaid work.
Just look at Android... This massive dystopian piece of Google spyware is literally built on the backs of unpaid idealists. It's truly sad...
Hubris (Score:2)
I appreciate people are pissed BUT this is opensource and free. The original developer can do what he likes with it.
Countless 1000s of downloads but I bet nary a dollar ever gets back to him.
Re:Hubris (Score:5, Insightful)
I appreciate people are pissed BUT this is opensource and free. The original developer can do what he likes with it.
indeed, he is absolutely free to create a time wasting and confidence compromising nuisance that alienates a sizable part of his userbase. good for him. if that's a smart way to get more sponsors and more dollars back to him is a different question.
in most shops i have worked this would have meant instantly freezing the dependency at the prior version, and the creation of backlogs to choose an alternative mock library and convert any test code to it. a provider that silently slips a shady dll into your dependency graph can simply not be trusted, period. no, not even by .net shops.
Re:Hubris (Score:-1)
I appreciate people are pissed BUT this is opensource and free.
Indeed. One of the great things about open source is that the project can be forked.
The original developer can do what he likes with it.
Agreed. Including developing a bit of software that nobody uses because he pulled some shenanigans.
Countless 1000s of downloads but I bet nary a dollar ever gets back to him.
Pure speculation based on zero evidence. Feelings and "hunches" are not evidence of ANYTHING. That statement is... asinine. It's based on NOTHING but your desire to speak.
Re:Hubris (Score:2)
> I appreciate people are pissed BUT this is opensource and free. The original developer can do what he likes with it.
What do you mean free? The dev has found a way to take payment without telling you.
On China? (Score:0)
This move sent shock waves across the open source (Score:2)
> This move sent shock waves across the open source ecosystem...
No, it really didn't. This action is actually one of the more benign stupid moves some open source projects have made. Probably the biggest news is that it wasn't some node.js weenie, it was a dotnet project instead.
We've had the same sorts of moves inject honest-to-goodness malware into people's projects. Collecting some email addresses sounds pretty tame in comparison. Either way though, this should be impetus to:
1) Version lock all your software dependencies so that sudden changes like this don't hurt you
2) Look to replace libraries and tools that pull these kinds of stunts.
What the hell is moq? (Score:0)
"I've never heard of Moq" Tom said derisively.