WannaCry: Dry your tears, but stay vigilant!
The recent WannaCry pandemic mostly affected workstations within businesses, but servers running outdated versions of Windows were not immune either. What action can a host take when it is up against an attack like this? There is no easy answer to this question. Responsibility essentially lies with the users: they alone are in a position to keep their systems up to date. Nevertheless, we at OVH made every effort to limit the WannaCry infection on vulnerable servers among our customers, and then to contain this particularly virulent epidemic. This outbreak provides an opportunity to remind everyone of a few important computer hygiene measures to fight against this type of threat, which is not likely to go away any time soon.
If you haven’t heard about the ransomware WannaCrypt0r in recent days, you must have been living on another planet (that or too distracted by politics on the news). Exploiting a Windows file-sharing vulnerability, the attack swept around the globe. The media jumped on the story, with reports focusing on the problems experienced by companies like FedEx in the US or Renault in France, and the struggles of public hospitals in the UK, where operations had to be delayed because access to patient files was blocked. How did OVH detect and treat this major ransomware attack? Let’s just say our security team didn’t get much sleep last week.
Dubbed ‘WannaCry’ (doubtless referring to how it makes its victims feel), the attack infected somewhere between 150,000 and 300,000 devices worldwide. That figure is unfortunately still rising: the epidemic is not completely over. There are even (and we’ll come back to this later) serious reasons to fear a fresh outbreak of infections, with new WannaCry variants cropping up over the past few days. Contamination remains relatively marginal at OVH: only 5,000 IP addresses have been affected out of the more than two million that we manage (about 0.25%). These latest estimates are based on data from our network of honeypots: OVH-owned devices managed by the SOC team that are deliberately exposed on the network to attract all the different types of threat that abound on the web (malware, hackers, phishing…), to get an idea of how dangerous they are and how fast they spread. Although the number of infected devices within the OVH network seems low in relation to the size of our server park, we rallied to prevent new devices from being contaminated. (To be clear, we are talking here about devices supplied by OVH to our customers, which have become vulnerable because the people responsible for updating them did not do so. WannaCry did not affect the OVH information system servers; we will talk about that in more detail later.)
WannaCry: the high cost of success
Before we get started, let’s remind ourselves what WannaCry is. Ransomware is a type of malicious software (malware) that takes its victim’s data hostage, often using encryption, and demands payment for its safe recovery. Most of the time the ransom is to be paid in bitcoins to make it harder to trace the flow of money. This process is nothing new. A highly lucrative enterprise, it has been booming since 2005. The FBI estimates the value of the ransomware market at a little over 830 million dollars in 2016. To give you an idea, this was the size of the SVOD market in the UK in 2015… Or the equivalent of Uber’s losses in the third quarter of 2016. We’ll let you choose the comparison that grabs you most; they both did the trick for us.
It is important to note that an earlier variant of WannaCry had been seen prior to this recent epidemic. Of course, the designers of the software didn’t publish a release note, but to briefly sum up: the initial version of the malware, which spread through malicious emails, did not have the success its creators were counting on. The more effective WannaCry 2.0 is based on exploiting a Windows file sharing (SMB) vulnerability. This vulnerability, which was patched by Microsoft on 14 March 2017 with the MS17-010 update, allows arbitrary code to be executed on any remote device that publicly exposes the Windows file sharing service. This is how outdated devices were infected, mostly running Windows Server 2008 and older versions of the Microsoft operating system. It is worth noting that from Windows Server 2012 and Windows 10 onward, the firewall is configured so that this service is not exposed by default, which explains why they saw fewer victims. To follow the rest of the story, it is important to understand that the file sharing service is exposed on TCP port 445.
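For administrators who want to see whether one of their own machines exposes the file sharing service, a simple TCP connection attempt is enough. The snippet below is only a minimal sketch: the function name is_tcp_port_open is ours, and the check says nothing about whether the MS17-010 patch is installed. It only reports whether the port answers from wherever the script is run.

```python
import socket


def is_tcp_port_open(host: str, port: int = 445, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection performs the full TCP handshake, then we close at once.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts alike.
        return False
```

Calling is_tcp_port_open("203.0.113.10") (a placeholder address from the documentation range) from outside your network gives a first indication of whether TCP 445 is publicly reachable on a server you administer.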
There are several different types of malware. We have already discussed ransomware, which takes the user hostage by encrypting his or her data. But in the case of WannaCry, the ransomware is spread by exploiting a vulnerability. This is what worms do: exploit a vulnerability. This makes WannaCry a hybrid creation, part worm and part ransomware. If WannaCry is the high cost of success, the success is of course that of Microsoft, whose systems have flooded companies worldwide.
Flashback: Inglorius Blaster
Do you remember the Blaster worm? Exploiting a vulnerability that Microsoft had not patched yet after several weeks, this worm infected several hundred thousand computers running Windows 2000 and Windows XP in August 2003. Forcing the computer to reboot after 60 seconds, Blaster then scanned the Internet looking for other computers to contaminate, spreading, like WannaCry, via P2P (peer-to-peer) infection.
When our honeypots started to buzz
On Friday, May 12, a few tweets from @MalwareHunterTeam were already circulating about a new threat called WannaCry that was spreading fast. The first press articles appeared that evening, with information on how it exploited a vulnerability previously patched by Microsoft. Then, at around 9 pm, our technical support team in Canada was surprised by an unusual number of tickets related to ransomware.
As a cloud service provider, we fight against the proliferation of ransomware and malware to the best of our ability. This is a long-term battle, discreetly waged by the SOC (Security Operations Center) team here at OVH. Complex investigations are involved when it comes to tracing the criminal networks that are behind these attacks, so we can report them to the authorities to get them shut down. On a day-to-day basis, however, prevention is the only real way to eliminate the threat of malware attacks, by reducing the primary risk: the human factor. As IT managers know all too well, it is essential to update the computer park, even though this can be a bit of a nightmare. Applications can be incompatible with the new version of an OS, for example. Therefore, it does not come as too much of a surprise to see so many devices still running Windows 2008 and older. At the same time, users must continually be made aware of another potential source of contamination: the email attachments that arrive in their inbox. (This might mean carrying out test campaigns, like at OVH, with fake phishing emails to gauge the vulnerability of employees so that the less tech-savvy can be trained on better computer hygiene.) Finally, there is one simple yet wonderfully effective way to protect yourself against ransomware attacks (along with most other IT catastrophes that affect data): making backups of your sensitive data!
All this to say that having to cancel date night on a Friday because of a ransomware attack is a pretty rare occurrence. But this one was rather special. We knew that WannaCry was spreading via EternalBlue, exploiting a vulnerability by attacking TCP port 445. What you may not know is that TCP port 445, which is used for Windows file sharing (CIFS), has been filtered by the border routers on the OVH network for many years for security reasons (1). TCP port 445 (SMB) had already proven itself to be a vector for spreading malware, and OVH felt that it should only be exposed within a private network. If a user wishes to communicate via this port remotely, they must do so via a VPN (Virtual Private Network). This means it is impossible to connect to a device hosted by OVH via TCP port 445, except from an IP belonging to OVH. (The filter is not applied inside the OVH network because customers may use file sharing for legitimate reasons.) All this meant that devices in the OVH network should have been immune to contamination…
And yet during the night from Friday to Saturday, May 13, our honeypots alerted us to a sudden rise in the number of scans from IPs inside the OVH network. We immediately got several parallel investigations underway: a methodical search for patient zero, to find out how WannaCry managed to penetrate the OVH network to spread the infection; reverse engineering the ransomware to understand how it works; introducing measures to prevent other vulnerable devices from being contaminated; and of course, scrupulously checking the devices in our internal system.
How does WannaCry work and why did it affect companies more than individuals?
In working to reverse engineer WannaCry, our priority was to identify how the worm spreads and the strategy it uses to search the Internet for vulnerable devices to contaminate. We saw that WannaCry initiates then executes two scanning methods concurrently (multithreading). The first method consists of scanning the local network to infect other connected devices. To avoid detection by the intrusion detection system (IDS), the code does not exceed the limit of ten simultaneous intrusion attempts. This precaution seems to indicate that internal networks within companies were targeted.
The second method scans the Internet by randomly creating a valid IPv4 address, generating the four bytes that make up the address one by one and eliminating addresses whose first byte is 127 or is 224 or higher. Interestingly, when a vulnerable IP is detected, it seems that the worm attempts to scan the entire /24 (corresponding to 253 neighboring IPs). Our observations suggest that infected devices are capable of scanning around 30 IPs/second.
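To make the address selection rule concrete, here is a small sketch of it. It is our own reconstruction of the behaviour described above, not code taken from the worm: the function names are ours, and redrawing the first byte until it is acceptable is an assumption (the sample may just as well discard the whole address and start again).

```python
import ipaddress
import random


def is_candidate_first_octet(octet: int) -> bool:
    """Rule described in the text: skip 127 and anything from 224 upwards."""
    return octet != 127 and octet < 224


def random_candidate_ipv4(rng: random.Random) -> ipaddress.IPv4Address:
    """Draw the four bytes one by one; redraw the first byte until it is allowed."""
    first = rng.randrange(256)
    while not is_candidate_first_octet(first):
        first = rng.randrange(256)
    rest = [rng.randrange(256) for _ in range(3)]
    return ipaddress.IPv4Address(bytes([first, *rest]))


def neighbours_in_slash24(addr: ipaddress.IPv4Address) -> list:
    """Usable host addresses sharing the same /24 as `addr`, excluding `addr` itself."""
    net = ipaddress.ip_network(f"{addr}/24", strict=False)
    return [host for host in net.hosts() if host != addr]
```

For a host address such as 192.0.2.10, neighbours_in_slash24 returns the 253 other usable addresses of the /24, which matches the figure quoted above. At around 30 IPs/second, sweeping them takes a little over eight seconds.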
Another interesting fact: the timeout delay incorporated into the attempted compromise. One technique used by security experts to slow down the spread of this type of malware is to purposely monopolise the connection to prevent one of the threads from scanning (it must wait for the server to respond). This makes it statistically possible to shut down the threads and stop the scanning. This technique is called ‘tarpit’. In the case of WannaCry, it seems that the designer of the worm took the trouble to insert a timeout delay of one hour before disconnecting if the server doesn’t respond. Another element revealed by examining the code: compromised devices are also infected via DoublePulsar, a backdoor tool possibly used by Equation Group to inject malware into the target device. Looking at the WannaCry code, this backdoor tool seems to be reused to re-infect a device later.
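The tarpit idea can be illustrated with a few lines of code. What follows is a deliberately simplified, application-level sketch under our own assumptions (real tarpits usually work lower down in the TCP stack, and the name run_tarpit is ours): the listener accepts every connection and then simply never answers, so the client thread on the other end is left waiting.

```python
import socket
import threading


def run_tarpit(server: socket.socket, stop: threading.Event) -> int:
    """Accept TCP connections and hold them open without ever replying.

    `server` must already be bound and listening. Returns the number of
    connections that were held when `stop` was set.
    """
    held = []
    server.settimeout(0.2)  # wake up regularly to check the stop flag
    try:
        while not stop.is_set():
            try:
                conn, _peer = server.accept()
            except socket.timeout:
                continue
            held.append(conn)  # never read, never write: the client just waits
    finally:
        for conn in held:
            conn.close()
    return len(held)
```

Run in a background thread against a listening socket, this keeps each connecting client blocked until the client's own timeout expires or the tarpit is stopped.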
What do you need to remember about the way WannaCry works? First, that the code makes companies its privileged vectors of transmission. Certain choices in the algorithms make us think this. In general, computer parks within companies are relatively homogenous. Thus, vulnerable devices are likely to be found very quickly on a local network. Scanning the whole Internet is statistically less efficient. That said, although infection is less widespread among servers than among workstations in companies, it is still critical in terms of the spread of the ransomware. A server has more resources and a higher bandwidth, which makes it much better at searching the Internet for vulnerable devices.
Charity begins at home…
We must put our own house in order before we can help solve a problem. Before hoping to save the world from WannaCry, we had to check the devices in our internal system to make sure none had been compromised.
We of course have a very strict internal policy on updating computers, which consists of installing patches and updates on every one of our devices as soon as they become available. All the same, no one is ever entirely protected from vulnerabilities. Computer security demands good practice but it also calls for a good dose of humility: never think you are untouchable. We were not so worried about a device in our internal system being infected (few run Windows, and the ones that do are very closely monitored). We were more concerned about finding a VM used for a test that had not been properly disposed of. If this happened the problem would not be data encryption, because these servers do not usually contain any valuable content, but rather the DoublePulsar backdoor tool that would make it possible for the ransomware to get into our systems, combined with the potential of the server to infect other vulnerable devices by scanning TCP 445 ports that were open inside the OVH network. There has been much talk about WannaCry in the media, but let’s not forget that it is not the only malware to exploit these vulnerabilities.
The typical profile of infected and vulnerable devices
Having found nothing of significance in our internal devices, we used the data gathered by our network of honeypots to put together a typical profile of servers that are susceptible to being infected by WannaCry. We detected several types of servers running vulnerable Windows operating systems: VPSs (commercialised directly or via resellers), dedicated servers, and Private Cloud VMs running Windows 2008 or older.
This led us to review the images supplied by OVH to preinstall/reinstall operating systems. We realised that certain images did not include the security patch by Microsoft that corrects the vulnerability exploited by EternalBlue. Of course, the Windows images that we supply are configured to carry out automatic updates from Microsoft. In the case of Windows Server 2008, however, the port is exposed at OS startup, which means that the server can be infected before it even has time to update itself. We had to contain this risk by patching the Windows images right away. Given the very low speed on the Microsoft Update platform, we quickly realised that we were not the only ones wanting to get our hands on this patch…
Tasks explaining how we updated the Windows images were published in real time (2), and we decided to create a robot so that security patches for Windows templates would be automatically updated and integrated in the future. We listed the IP addresses of all the devices that attempted to compromise our honeypots, and suspended the servers in question right after warning the customers concerned. This process is still underway. We considered temporarily shutting down exchanges made on TCP port 445 across the entire OVH network, but it was too radical a measure that risked having an impact on customers who use Windows file sharing legitimately and had applied the security patch that Microsoft made available two months ago. We decided instead to shut down the service for users whose server had been compromised, sending them a message to let them know (IP blocked or server in rescue mode). The message read as follows:
“Your server was involved in a ransomware cyberattack. We had to reboot it in rescue mode to stop the attack from spreading. If you use a fail-over IP, this IP has been blocked.”
The ‘good news’ is that none of the infected servers managed to contaminate the rest of the Internet outside the OVH network. The filtering applied by OVH on TCP port 445 prevented the worm from receiving the response from its potential victim (the SYN-ACK could not be received, because it is dropped at the border, which means the TCP connection could not be established).
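The effect of the border filter on the TCP handshake can be modelled in a few lines. This is a toy model built on our own assumptions, not the actual router configuration: we assume an ingress rule that drops any packet entering the network with 445 as its source or destination port, and the prefix 198.51.100.0/24 is a placeholder standing in for "inside the provider network".

```python
import ipaddress
from dataclasses import dataclass

# Hypothetical prefix standing in for the provider's own address space.
INTERNAL = ipaddress.ip_network("198.51.100.0/24")
SMB_PORT = 445


@dataclass(frozen=True)
class Packet:
    src: str
    sport: int
    dst: str
    dport: int


def dropped_at_ingress(p: Packet) -> bool:
    """Assumed rule: drop packets entering the network that involve TCP port 445."""
    entering = (ipaddress.ip_address(p.src) not in INTERNAL
                and ipaddress.ip_address(p.dst) in INTERNAL)
    return entering and SMB_PORT in (p.sport, p.dport)


# An infected internal host scans an external victim on port 445.
syn = Packet("198.51.100.7", 50000, "203.0.113.9", 445)      # leaves the network
syn_ack = Packet("203.0.113.9", 445, "198.51.100.7", 50000)  # the victim's reply
```

Under this model the outgoing SYN is not filtered, but the returning SYN-ACK is dropped, so the handshake never completes. Traffic between two internal addresses is untouched, which is also why the worm could still spread inside the network.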
Looking for patient zero
How could TCP port 445 scans of OVH-hosted devices happen inside our network when we had blocked that port at the entry to our backbone? This is the question we quickly asked ourselves. Our first reflex of course was to check the rules on the routers. To no avail: the filtering was working properly. We then worked backwards, examining the events recorded in the NetFlows from our anti-DDoS (VAC) protection system to identify the IP that introduced WannaCry into our network. We had to go back several days to find the guilty party. Or rather the guilty parties, because the source turned out to be several contaminated PCs connected to the Internet via xDSL access supplied by OVH. We saw that several IPs on xDSL access independently started to scan the Internet for vulnerable devices, for several hours, on the morning of Friday, May 12. As shown by the graphic below, these IPs started to infect VPSs, then dedicated servers. It is easy to see that an xDSL connection does not have the same strike force as a dedicated server, notably in terms of bandwidth. As soon as the first dedicated server was infected we saw an explosion in the number of scans launched by infected devices: the chain reaction had begun.
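Working backwards through flow records amounts to asking which sources started contacting many different hosts on TCP 445, and when. The sketch below shows the idea on a hypothetical, heavily simplified record format (timestamp, source, destination, destination port). The field names, the function name and the threshold of three distinct targets are all our assumptions, not the schema or tooling used on the real NetFlow data.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    ts: int      # time the flow was seen, in epoch seconds
    src: str     # source IP
    dst: str     # destination IP
    dport: int   # destination TCP port


def first_scan_per_source(flows, port=445, min_targets=3):
    """Return (first_seen, source) pairs, earliest first, for every source that
    contacted at least `min_targets` distinct hosts on `port`."""
    targets = defaultdict(set)
    first_seen = {}
    for f in flows:
        if f.dport != port:
            continue
        targets[f.src].add(f.dst)
        first_seen[f.src] = min(f.ts, first_seen.get(f.src, f.ts))
    return sorted((first_seen[s], s) for s, seen in targets.items()
                  if len(seen) >= min_targets)
```

The first entries of the returned list are the candidates for patient zero. A source that only ever talks to one or two hosts on port 445 is ignored, since that looks like ordinary file sharing rather than scanning.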
Predictably enough, VPNs were impacted next. By creating a secure tunnel between the exterior and an OVH device, VPNs logically allowed the network entry rules to be bypassed (including the blocking of TCP port 445). The servers hosting VPN services started to emit very heavy traffic, IPs generally being shared for this type of service. Our auto detection service for scanning ports was triggered frequently, removing affected devices from the network (reboot in rescue mode).
Although the filtering rules at the entry point to our network did not have a 100% success rate against this particularly virulent epidemic, these rules did allow us to slow down its spread, giving us more time to roll out measures that enabled us to contain the outbreak by mid-afternoon on Saturday May 13.
The kill switch: malware’s ultimate trick to avoid detection by security teams
You have doubtless read the story of the security expert who explains how he brought the proliferation of WannaCry to a standstill by registering a specific domain name that managed to stop the ransomware from spreading.
Many people wondered about the ‘kill switch’, an emergency mechanism contained within the code of a piece of ransomware that stops it from spreading. Many pieces of malware contain this type of ‘emergency button’. Despite appearances, this ‘kill switch’ was not meant to neutralise the ransomware. It is a defense mechanism designed to make it harder for security teams to analyse it. Once a sample of the ransomware has been captured, experts can execute the file within a sandbox (a secured environment, isolated from the network) where they can focus on reverse engineering the code and finding out how it works. This is where the kill switch comes in. By carrying out a network call, which consists, for example, of checking that a domain (generally generated randomly) does not exist, ransomware can detect whether it is executed inside a sandbox. If it is, the call will result in a positive response faked by the sandbox, which is isolated from the network and must prepare for other network calls which will be crucial in the analysis of the sample. If it knows that it is executed in a sandbox and therefore scrutinised by a range of profiling tools, the ransomware will not be activated, preventing our tools from mapping how it works. As a side note, if you install VMware Tools (generally present on virtual devices) on your computers, certain types of malware might leave you alone. Many of them use this as a detection marker.
In the case of WannaCry, the domain used as the kill switch mechanism had not been generated randomly. Therefore, the spread of the epidemic was slowed when the domain was registered. A fresh outbreak could easily occur if the domain is released into the wild in the future, or, as we are seeing now, if WannaCry variants appear with a similar code but a slightly different kill switch mechanism.
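The logic of such a check is very simple, which is also why registering the domain was enough to trip it everywhere at once. Here is a minimal model of it, with several assumptions of ours: the check is reduced to a DNS lookup, the function names are ours, and the domain is a placeholder, since the real one is not given in this article.

```python
import socket

# Hypothetical placeholder; not the domain used by WannaCry.
KILL_SWITCH_DOMAIN = "kill-switch.example.invalid"


def domain_resolves(name: str, resolver=socket.gethostbyname) -> bool:
    """Return True if `name` resolves to an address, False on any lookup error."""
    try:
        resolver(name)
        return True
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


def would_stay_dormant(name: str = KILL_SWITCH_DOMAIN,
                       resolver=socket.gethostbyname) -> bool:
    """Model of the check: a positive answer is read as 'I am being observed'."""
    return domain_resolves(name, resolver)
```

As long as nobody owns the domain, the lookup fails on a real machine and succeeds only inside a sandbox that fakes answers. Once the domain is registered, the lookup succeeds everywhere, and every copy behaves as if it were being analysed.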
You’ve been contaminated? Don’t pay up!
Paying essentially means feeding criminal networks, and as such giving criminal groups the means to develop ever more destructive malware. Even if the creators have a ‘reputation’ for returning data after payment (as they boast in the reminder message that they have recently sent to their victims, too few of whom have paid up for their liking), it is important to realise that there are much less scrupulous individuals out there who won’t think twice about jumping on the bandwagon to make some easy money. Some of them have already forged the WannaCrypt0r interface, to spread ransomware that demands payment even though victims’ files haven’t even been encrypted. Watch out for the con artists!
If your device has been contaminated, re-installing the latest version of your Windows system and the security patches supplied by Microsoft is the best way to contain WannaCry. Some tools have been released to help decrypt your data, if you are lucky. Particularly notable is a French project called WannaKiwi, which inspects your device’s memory to find the key. Kaspersky Lab has created a platform that more broadly lists different examples of software that could help you recover when you fall prey to a ransomware attack.
You can also go to the nearest police station to file a complaint and/or get more information on the penal response that could be considered. Lastly, and we can’t repeat it often enough: don’t forget to make backup copies of your sensitive data.
During this attack, many people reported websites being down. If this happened to your site, we recommend you change every one of your passwords, no exceptions. Why? Even if they were encrypted, your files were available for download by anyone. So, if your business or your data is worth it, what’s stopping someone from having a go at trying to decrypt your "config.php" and obtaining privileged access to your database? What are you waiting for?!
The threat has not gone away. Use the media coverage of this epidemic to make IT security a top priority within your company!
As you can imagine, even though the media will soon stop talking about WannaCry, the threat remains ever present. The graph above shows a fresh surge of attempts by compromised devices to compromise others over the past few days. This is because WannaCry variants have appeared and continue to contaminate vulnerable devices. Not only that, but WannaCry is not the first to exploit this vulnerability in Windows systems. Long before it came along, Adylkuzz was already exploiting these vulnerabilities to compromise devices and mine Monero, a Bitcoin-alternative cryptocurrency. Numerous pieces of malware are currently circulating that use the EternalBlue exploit, but also its close relatives: EternalSynergy, EternalChampion and EternalRomance. A botnet using the same vulnerability to spread and then launch large DDoS attacks, like Mirai last September, is more than likely to appear in the hours to come.
Don’t forget that when your device has been compromised, the DoublePulsar backdoor has been installed on it. Even if the Microsoft MS17-010 patch is applied, your device remains vulnerable on the Internet. There are many tools available to find out whether your device has been compromised, like this one from the creator of the network mapper tool ‘nmap’.
Here at OVH we know that it is utopian to believe that the threat will go away in the days to come. This is a threat that is here for the long term. There will still be vulnerable devices to compromise on the Internet, as we have seen with the NTP service vulnerability that was revealed in 2013 and yet is still used to carry out denial of service attacks to this day.
Once you have patched every device in your network, do remember that every cloud has a silver lining. The media coverage WannaCry has received has opened everyone’s eyes as to why it is necessary to invest more in IT security. It is a critical issue. Capital within a company is no longer just a question of property assets, human resources, or even reputation. Data, and by extension the information system, are now just as important when it comes to assets within a company. Right now, you have the attention of your colleagues and your company’s executive committee. Make the most of it, before they lose interest (until the next epidemic comes along, that is).
(1) Octave Klaba, the founder and CEO of OVH, spoke about how TCP port 445 is blocked by the OVH border network for security reasons in a message on the email@example.com mailing list on June 27 2016. He explained that this had been in place for 11 years and always would be. The update that Microsoft subsequently made available for its operating system goes in the same direction, because it means that this port, which was designed for file sharing within a local network, would no longer be exposed by default.
(2) Published tasks: