A Vast New Data Set Could Supercharge the AI Hunt for Crypto Money Laundering

As a test of their resulting AI tool, the researchers checked its outputs with one cryptocurrency exchange—which the paper doesn’t name—identifying 52 suspicious chains of transactions that had all ultimately flowed into that exchange. The exchange, it turned out, had already flagged 14 of the accounts that had received those funds for suspected illicit activity, including eight it had marked as associated with money laundering or fraud, based in part on know-your-customer information it had requested from the account owners. Despite having no access to that know-your-customer data or any information about the origin of the funds, the researchers’ AI model had matched the conclusions of the exchange’s own investigators.

Correctly identifying 14 out of 52 of those customer accounts as suspicious may not sound like a high success rate, but the researchers point out that only 0.1 percent of the exchange’s accounts overall are flagged as potential money laundering. Their automated tool, they argue, had essentially narrowed the hunt from one suspicious account in a thousand to more than one in four. “Going from ‘one in a thousand things we look at are going to be illicit’ to 14 out of 52 is a crazy change,” says Mark Weber, one of the paper’s coauthors and a fellow at MIT’s Media Lab. “And now the investigators are actually going to look into the remainder of those to see, wait, did we miss something?”
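The arithmetic behind that comparison can be sketched in a few lines of Python. This is only a rough illustration using the figures quoted above; the variable names are made up for the example. It also assumes the two rates are directly comparable, which is a simplification: the 0.1 percent figure refers to accounts flagged for potential money laundering, while the 14 accounts had been flagged for suspected illicit activity more broadly.

```python
# Figures quoted in the article; variable names are illustrative only.
base_rate = 0.001        # 0.1 percent of the exchange's accounts flagged overall
flagged_by_exchange = 14 # accounts the exchange had already flagged
chains_identified = 52   # suspicious transaction chains surfaced by the model

# Share of the model's picks that the exchange had independently flagged.
model_rate = flagged_by_exchange / chains_identified

# How much more concentrated the flagged accounts are in the model's picks
# than in the exchange's accounts as a whole (assumes comparable rates).
lift = model_rate / base_rate

print(f"Model hit rate: {model_rate:.1%}")           # 26.9%
print(f"Improvement over base rate: {lift:.0f}x")    # 269x
```

Under those assumptions, an investigator reviewing the model's picks would hit an already-flagged account a little more than one time in four, roughly 270 times the rate of checking accounts at random.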

Elliptic says it’s already been privately using the AI model in its own work. As more evidence that the AI model is producing useful results, the researchers write that analyzing the source of funds for some suspicious transaction chains identified by the model helped them discover Bitcoin addresses controlled by a Russian dark-web market, a cryptocurrency “mixer” designed to obfuscate the trail of bitcoins on the blockchain, and a Panama-based Ponzi scheme. (Elliptic declined to identify any of those alleged criminals or services by name, telling WIRED it doesn’t identify the targets of ongoing investigations.)

Perhaps more important than the practical use of the researchers’ own AI model, however, is the potential of Elliptic’s training data, which the researchers have published on the Google-owned machine learning and data science community site Kaggle. “Elliptic could have kept this for themselves,” says MIT’s Weber. “Instead there was very much an open source ethos here of contributing something to the community that will allow everyone, even their competitors, to be better at anti-money-laundering.” Elliptic notes that the data it released is anonymized and doesn’t contain any identifiers for the owners of Bitcoin addresses or even the addresses themselves, only the structural data of the “subgraphs” of transactions it tagged with its ratings of suspicion of money laundering.

That enormous data trove will no doubt inspire and enable much more AI-focused research into bitcoin money laundering, says Stefan Savage, a computer science professor at the University of California San Diego who served as adviser to the lead author of a seminal bitcoin-tracing paper published in 2013. He argues, though, that the tool in its current form seems less likely to revolutionize anti-money-laundering efforts in crypto than to serve as a proof of concept. “An analyst, I think, is going to have a hard time with a tool that’s kind of right sometimes,” Savage says. “I view this as an advance that says, ‘Hey, there’s a thing here. More people should work on this.’”
