When a company that I worked at was doing some testing with testnet bitcoins, I thought it would be useful to maintain an office pool based on a brainwallet, so that anyone could just use the testnet coins for experiments, and then send them back to the brainwallet. The passphrase was a dictionary word with a couple of trivial substitutions -- something like "company" => "c0mp4ny!" -- to make it easy for developers to remember for ad hoc testing.
After I sent some testnet coins to the address but before I could even send out an email telling people about it, the coins had been stolen (it might have been in the same block that my original transaction appeared in, I don't remember). This was ~3 years ago, and on testnet -- the only thing I could imagine is that someone was testing out a brainwallet-stealing bot on testnet before deploying it to mainnet. I was very impressed.
I think a compelling theory is that these addresses are generated by blockchain.info accidentally hashing uninitialised memory instead of random data.
Uninitialised memory for blockchain.info would have a very high density of bitcoin-related data, and they also generate a large number of keys.
If they (sometimes?) end up hashing uninitialised memory instead of random data, there's every chance that they'd sometimes generate keys that are hashes of bitcoin-related data. It is admittedly unlikely that they'd be hashing data that is exactly the right length to be a bitcoin address or txid, but there could also be another bug which means they hash data that is nul-terminated rather than a fixed size.
Both fitwear[1] (mentioned in the pastebin) and another user[2] have reported BTC inexplicably being transferred out of their blockchain.info wallets.
The next question is, if this is really what's going on, how has it happened? Is it an honest bug or has it been planted deliberately (by either a hacker or an employee) with the intention of being able to steal funds from some proportion of blockchain.info addresses?
There is also every possibility that fitwear and the person who wrote the pastebin article are the same person, and that it is maliciously designed to spread FUD about blockchain.info. We shouldn't jump to any conclusions without seeing a smoking gun.
EDIT: But wait a minute! Blockchain.info keys are generated client-side in javascript, to the best of my knowledge, so this shouldn't explain it at all...
A bit of historical background might be good. Blockchain.info has had several high-impact bugs over the years that have led to their users losing funds, like biased random numbers and re-used r values. Another bug of this type shouldn't be too surprising.
Many people used to recommend against their wallet for these reasons, but on the other hand they haven't closed up shop and run off yet, which puts them head and shoulders above the average hosted wallet provider. Bitcoin is a very special space.
I've generated addresses for all nul-terminated strings over 10 chars from a memory dump of my bitcoin node, and I'm scanning my blockchain to see if any of them have been used.
I used gdb to dump the bitcoind memory, I'm using https://github.com/matja/bitcoin-tool to output human-readable addresses, I'm using (a modified version of) https://github.com/tenthirtyone/blocktools to parse the blockchain files, and some custom scripts to extract addresses from the scriptpubkeys and compare them against my list of addresses taken from the bitcoind memory.
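For anyone who wants to try the same experiment, here is a rough sketch of the first step only: pulling nul-terminated strings out of a memory dump and hashing each one into a candidate private key. It assumes the strings of interest are printable ASCII, and the function name and the 11-character minimum ("over 10 chars") are my own choices, not the scripts described above. Turning each key into an address still needs something like bitcoin-tool.

```python
import hashlib
import re


def candidate_keys(dump: bytes, min_len: int = 11):
    """Yield (string, sha256 hex digest) for each nul-terminated printable run.

    Assumption: only printable ASCII runs of at least min_len bytes that are
    immediately followed by a NUL byte are considered.
    """
    pattern = re.compile(rb"[\x20-\x7e]{%d,}(?=\x00)" % min_len)
    for match in pattern.finditer(dump):
        text = match.group()
        # The digest is used directly as a 256-bit candidate private key.
        yield text, hashlib.sha256(text).hexdigest()


# Toy "dump": one short string (skipped) and one address-like string (kept).
sample = b"\x01\x02short\x00" + b"1Ca15MELG5DzYpUgeXkkJ2Lt7iMa17SwAo\x00\xff\xfe"
results = list(candidate_keys(sample))
for text, key_hex in results:
    print(text.decode("ascii"), key_hex)
```

A tiny fraction of SHA-256 outputs are not valid secp256k1 private keys (zero, or at or above the curve order), so a real run would want to filter those out.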
I extract them from bitcoind's rpc interface by dumping blocks, but I started that process years ago. It'd take several months to dump everything from scratch at this point.
I don't see how transactions made in 2016 (2014 even?) are part of malicious FUD, they exist and are verifiable.
You could assume malice, yet their hypothesis is somewhat testable: if it is uninitialised memory you should have private keys based on word-size shifts due to alignments. 2^64 hashes is somewhat doable. Truncation should be easier to test (in case it's null-terminated).
I've just re-read the pastebin more thoroughly and this stuck out to me:
> At some point between then and Nov 12, the compromised 15ZwrzrRj9x4XpnocEGbLuPakzsY2S4Mit got into his online wallet as an 'imported' address.
The 15ZwrzrRj9x4XpnocEGbLuPakzsY2S4Mit address is generated by using sha256() of his previously-imported address 1Ca15MELG5DzYpUgeXkkJ2Lt7iMa17SwAo.
The other confusing part is:
> fitwear's 15Z address sat unused until Nov 12 when fitwear transferred his 9 BTC into it using blockchain.info.
Why did he send money to a random address in his "imported" addresses list in the first place? The usual wallet workflow would surely send change to an address derived from the wallet's actual seed, not an "imported" address. And fitwear would presumably have no reason to send money to himself on purpose, and even if he did why would he choose an "imported" address instead of one derived from the wallet seed?
So what exactly was fitwear trying to do here? The more I think about it, the more I think fitwear messed this up, and it's nothing to do with generating keys from uninitialised memory.
Possibly it's a combination of malware importing addresses into his blockchain.info wallet and him doing weird transactions that ended up losing him money, or possibly it's just FUD designed to discredit blockchain.info.
> it sounds like maybe some bit of code decided "hmm, that's not a well formatted WIF private key, it must be a brainwallet" without very clearly explaining what was going on
That sounds plausible. Possibly fitwear tried to import the same private key in 2 different ways and ended up getting burnt.
The most likely explanation is blockchain.info trying to magically figure out how to import anything you supply, and the code decided to treat an address as a brainwallet since it didn't look like any known format of private key.
This illustrates a very general problem in computer security: how do you know you can trust your hardware-firmware-software stack? It's not enough to simply trust your vendor (cf. the recent High Sierra root vulnerability). Even a vendor who says they are prioritizing security and has the technical chops to make that happen (like Apple) can make mistakes. Worse, unless they have really good internal auditing, a single dishonest employee can intentionally insert a vulnerability that can go unnoticed for a very long time (e.g. GOTO FAIL). If the stakes are high (e.g. you can walk away with a few hundred million dollars, or enable whatever intelligence agency you are loyal to to spy on hundreds of millions of people) then the incentive to insert a back door becomes equally high.
It is quite easy to create a pseudo-random number generator whose output is statistically indistinguishable from a real random number generator but is easily predictable to an attacker. (It's actually much easier to do this than to build a secure RNG.) This makes random number generators particularly attractive targets for attacks. Access to reliable random number generators is crucial for any kind of security, but is going to become increasingly challenging as time goes by.
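To make that concrete, here is a minimal sketch of such a generator. The class name and the secret are hypothetical and not taken from any real product: each output block is SHA-256 of a secret plus a counter. As long as SHA-256 behaves like a good hash, the output should look random to anyone who doesn't know the secret, while anyone who does know it can reproduce every "random" key.

```python
import hashlib


class BackdooredRng:
    """Hypothetical generator whose output is fully determined by a secret."""

    def __init__(self, secret: bytes):
        self._secret = secret
        self._counter = 0

    def random_bytes(self, n: int = 32) -> bytes:
        out = b""
        while len(out) < n:
            # Each block is SHA-256(secret || counter): unpredictable without
            # the secret, trivially reproducible with it.
            counter_bytes = self._counter.to_bytes(8, "big")
            out += hashlib.sha256(self._secret + counter_bytes).digest()
            self._counter += 1
        return out[:n]


# The victim's wallet software ships with the planted secret...
victim = BackdooredRng(b"attacker-chosen secret")
victim_key = victim.random_bytes(32)

# ...and the attacker replays the same sequence offline.
attacker = BackdooredRng(b"attacker-chosen secret")
predicted_key = attacker.random_bytes(32)

print(victim_key.hex())
print(predicted_key == victim_key)
```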
I think I have an explanation, but I'm not sure, so I would appreciate criticism.
OP = person who posted pastebin
E = person who's been stealing and laundering funds
The key observation is that some people "create private keys from just about anything using Sha256 (i.e. Sha256(password/phrase)). This, of course, is NOT a recommended way of obtaining a private key since if YOU can think of the word/phrase, someone else can too"
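As a small illustration of why that is weak (my own sketch, with a made-up substitution table, not anything E is known to have used): the private key is just the SHA-256 digest of the phrase, so anyone can enumerate dictionary words plus the usual character substitutions and get the same keys.

```python
import hashlib
from itertools import product

# Assumed "leet" substitutions and suffixes; a real attacker would use far more.
SUBSTITUTIONS = {"o": "o0", "a": "a4", "e": "e3", "i": "i1", "s": "s5"}
SUFFIXES = ("", "!")


def brainwallet_private_key(passphrase: str) -> str:
    """Return the hex private key a Sha256(passphrase) brainwallet would use."""
    return hashlib.sha256(passphrase.encode("utf-8")).hexdigest()


def variants(word: str):
    """Yield the word with every combination of substitutions and suffixes."""
    options = [SUBSTITUTIONS.get(ch, ch) for ch in word]
    for chars in product(*options):
        for suffix in SUFFIXES:
            yield "".join(chars) + suffix


# Lookup table from private key back to the phrase that produced it.
table = {brainwallet_private_key(v): v for v in variants("company")}
for key_hex, phrase in table.items():
    print(phrase, key_hex)
```

Even this toy table covers a phrase like "c0mp4ny!", and building it for a whole dictionary takes seconds.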
Maybe E noticed this in 2014 just like OP did. Then she decided to try to make a lookup table. First she hashed everything in a natural language dictionary. Then to try to make her hash lookup table a bit bigger, extended it to any string of a given length like a transaction id, wallet public key, etc that appeared in the blockchain. This is somewhat similar to a standard cryptanalysis technique - where you try using everything in RAM as a key to see if it's the LUKS key - but using these particular inputs is admittedly ad hoc.
However, the next textbook step after a lookup table is rainbow tables, which E started to build but then gave up on after a few iterations since the space was too large. Apparently this was enough to find the private keys for a significant number of other people's wallets.
Once E had that, it's trivial to write a script that performs onward transfers to other accounts she controls and steals the money. It also makes sense to transfer the funds through several wallets as a laundry. It's strange that the transfers occur precisely along the links in the rainbow table (a rainbow table is really a forest of linked lists) - very strange. It's possible E just got lazy, since this is one of the easiest ways to write the script that decides where to send the money, once she controlled all these sha256-chains (chains in the rainbow table sense) already. Of course, that laundry's not good enough, since OP has now discovered it and made all of us curious to identify E.
I would appreciate any criticism of this hypothesis.
Far fetched as this is, I don't really see why OP assumes an "exploit," when good old fashioned precomputation - rainbow tables - explains the observation. Thoughts? I may be totally off.
PS. There may be records of IP addresses involved in some of these transactions. Does anyone know where to find that kind of info, to see what IPs E was running the scripts from? People using VPSs, VPNs and Tor sometimes leak their home IP.
> One of the things that fascinated me was the ability for someone to create private keys from just about anything using Sha256
I'm also fascinated by this. For my next Bitcoin Treasure Hunt [1], I've been coming up with a lot of puzzles that involve private keys. It's so much fun to think about different ways to encode them. My first puzzle wasn't too interesting, but this one should be much more fun.
I've been trying to follow along with the article in Python and nothing is working. The article says that private key "KyTxSACvHPPDWnuE9cVi86kDgs59UFyVwx2Y3LPpAs88TqEdCKvb" is "4300d94bef2ee84bd9d0781398fd96daf98e419e403adc41957fb679dfa1facd" in raw bytes, but decoding base58 and hexlifying gives me something completely different:
In [17]: hexlify(base58.b58decode("KyTxSACvHPPDWnuE9cVi86kDgs59UFyVwx2Y3LPpAs88TqEdCKvb"))
Out[17]: '804300d94bef2ee84bd9d0781398fd96daf98e419e403adc41957fb679dfa1facd012774e1c4'
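Actually, it looks like it's a combination of Base58Check decoding (which drops the 4-byte checksum at the end) and chopping off the first and last bytes (the 0x80 version byte and the 0x01 compressed-key flag). It matches then. Thanks for the bitcoin-tool hint, looks great!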
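For anyone else following along, here is a dependency-free sketch of that decoding. It assumes the usual WIF layout of version byte, 32-byte key, optional 0x01 compression flag and 4-byte checksum; the function names are mine.

```python
import hashlib

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58decode(text: str) -> bytes:
    """Decode a Base58 string to bytes (leading '1's become zero bytes)."""
    n = 0
    for ch in text:
        n = n * 58 + B58_ALPHABET.index(ch)
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    pad = len(text) - len(text.lstrip("1"))
    return b"\x00" * pad + body


def wif_to_hex(wif: str):
    """Return (version, key_hex, compressed, checksum_ok) for a WIF key."""
    raw = b58decode(wif)
    payload, checksum = raw[:-4], raw[-4:]
    # Base58Check: checksum is the first 4 bytes of a double SHA-256.
    digest = hashlib.sha256(hashlib.sha256(payload).digest()).digest()
    checksum_ok = digest[:4] == checksum
    version, key = payload[0], payload[1:33]
    compressed = len(payload) == 34 and payload[33] == 0x01
    return version, key.hex(), compressed, checksum_ok


version, key_hex, compressed, checksum_ok = wif_to_hex(
    "KyTxSACvHPPDWnuE9cVi86kDgs59UFyVwx2Y3LPpAs88TqEdCKvb"
)
print(hex(version), key_hex, compressed, checksum_ok)
```

The key_hex value is the 32 bytes between the leading 80 and the trailing 01 in the hexlify output above.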
You should not publish other people's private keys, dumb as they may be.
While you should be commended for letting users know that they should use e.g. key strengthening (or, preferably, random numbers) for their private keys, you should not leak what are, whether by chance or intention, the private keys of other people's (bots'?) accounts.
There's nothing illegal about being an idiot.
It may be illegal to leak others' financial account information.
Might as well link directly to the source: https://pastebin.com/jCDFcESz which really I think should be the article URL anyway (the tweet just links here too). Very interesting find!
Looks to me like he stumbled on a bitcoin payment processing system?
That would explain how the "bot" knows a transaction is about to occur and why the bitcoins were being transferred within minutes of arriving.
The addresses he finds are temporary addresses a customer is directed to send payment, and then transferred to a more secure wallet shortly after confirmation.
Those bots still have to spend time finding addresses. If the transfer happens in seconds or minutes, it's more likely the address was already known and expecting a deposit.
The addresses are generated ahead of time and placed into a huge database. Every new transaction that is broadcast is checked against the database, and if any vulnerable outputs are created they are swept immediately.
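Roughly like this, as a toy sketch. Note that toy_address below is a stand-in I made up so the example stays self-contained; real address derivation needs secp256k1, RIPEMD-160 and Base58Check. The point is only that the expensive work happens ahead of time and each broadcast output costs one dictionary lookup.

```python
import hashlib


def toy_address(private_key_hex: str) -> str:
    # STAND-IN, not real Bitcoin address derivation (assumption for the demo).
    digest = hashlib.sha256(bytes.fromhex(private_key_hex)).hexdigest()
    return "toy-" + digest[:16]


def build_database(passphrases):
    """Precompute address -> private key for every guessed passphrase."""
    db = {}
    for phrase in passphrases:
        key_hex = hashlib.sha256(phrase.encode("utf-8")).hexdigest()
        db[toy_address(key_hex)] = key_hex
    return db


def vulnerable_outputs(db, outputs):
    """Return (address, amount, private key) for outputs the database can spend."""
    return [(addr, amount, db[addr]) for addr, amount in outputs if addr in db]


db = build_database(["password", "c0mp4ny!", "correct horse battery staple"])

# Outputs of a newly broadcast transaction: one pays a known weak address.
weak = toy_address(hashlib.sha256(b"c0mp4ny!").hexdigest())
hits = vulnerable_outputs(db, [(weak, 9.0), ("toy-someone-else", 1.5)])
print(hits)
```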
If many of these originate with some large player like an exchange or a miner, shouldn't we perhaps be able to trace it back to that place and notify them? They might of course already be looking at it now that it has been made public. Actually, you'd think that they would have noticed their money getting lost.
Related, but on my mind: How could we ever know how many bots are transacting on the blockchain? It seems that it is ripe for manipulation by those that benefit from higher transaction numbers.
Are exchanges audited to ensure they aren't artificially inflating the price by using artificial buyers and sellers? It seems like there is every incentive to keep the price high, and it seems like doing so maliciously would be quite straightforward.
1.) Transacting on the blockchain costs fees. It would be very expensive to fake bitcoin transaction volume. Even if you're a miner, your fake volume would be displacing real volume that would be paying real fees.
2.) Transactions on exchanges never hit the blockchain, so exchange volume and blockchain volume are unrelated. Furthermore, keeping the price high would not be "quite straightforward". What method do you have in mind? The only way to artificially keep the price high long-term is to place infinitely-large buy orders at the minimum price you want. This will obviously cost you unbounded amounts of money and would be financial suicide.
Some of the exchanges have zero fees for high enough volume on some trading pairs... and if an exchange themselves are running wash trades it's not like they'd be paying their own fees anyways.
> Even if you're a miner, your fake volume would be displacing real volume that would be paying real fees.
You'd be missing out on mining revenue. And given that the marginal cost of mining tends towards the marginal reward for mining, missing out on revenue is almost as bad as making a loss.
There would have to be a substantial incentive to fake transaction volume in order to make it worth it.
Additionally, given that blocks are full these days, there's not very much scope for increasing the apparent transaction volume anyway. The best you could do is replace large "real" transactions with a larger number of smaller "fake" transactions.
What does "early miners" mean? Nobody has any inherent advantage in mining bitcoins. The costs are the same for everyone.
Some people may have an advantage by having cheaper electricity costs or better mining hardware, but they would still be missing out on significant amounts of fees by not mining legit transactions.