Inspired by a Reddit post warning of malicious URLs relating to cryptocurrency exchanges, I decided to make Codify, a Chrome extension that warns the user when a homoglyph phishing attack may be in use. The aim is to help prevent online fraud against victims who could easily mistake a manipulated URL for a genuine one.
Codify- Developed by myself

These are called homoglyph attacks: to us the URL looks normal, but to a computer it is very different from what it appears to be.

For example, http://аррӏе.com might appear to be Apple's homepage, but if you click on it you'll see the real address is "https://xn--80ak6aa92e.com/" (this link is safe and has been set up simply to demonstrate the attack). This is very worrying, as one of the biggest indicators of an online scam is typically a URL that doesn't look right, and this attack removes that vital indicator. Most browsers have now been updated to display the real URL once the page loads, but it's still too easy to click on the link and believe the page is genuine.

That's why I developed Codify. It's a simple Chrome add-on that gives a warning when a user opens a page whose URL either contains Punycode formatting or contains characters from outside the English alphabet, such as letters with umlauts.

I'd never made a Chrome extension before, so I had to start from first principles and followed basic templates to get going. From that point onward, all of the JavaScript was written, developed and tested by myself within a 48 hour development period. All the code is available at my GitHub link.

 

The first major component of the add-on is that every time a page is loaded in the browser, the real URL is parsed and checked for URL manipulation (where a URL is crafted to look like a link that it isn't). The function "AddListen" simply adds a listener that is called whenever any tab is updated. The listener then checks that the tab that triggered it is 'complete' and 'active', i.e. it has finished loading a webpage and is the tab currently in view.

The add-on then checks whether "number" is set to 1. This is a stored variable that records whether the user has switched the add-on On or Off; I'll discuss the toggling functionality later.

If all of the above conditions are met, ParseUrl() is called to check that the page the browser has just loaded has a genuine URL. The last thing to note is that, at the very bottom, "AddListen()" is written outside of the function, meaning it is run every time the add-on is loaded into Chrome. This automatically adds the listener so it's ready to run each time a page is loaded.
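As a rough sketch of this step (not the exact code from the extension), the listener could be wired up as below. The names AddListen, ParseUrl and "number" come from the description above; the use of chrome.storage.local, the helper shouldCheck, and passing the chrome object in as a parameter are assumptions made so the example can also run outside the browser.

```javascript
// Decide whether a tab update should trigger a URL check.
// `enabled` mirrors the stored "number" flag (1 = add-on switched on).
function shouldCheck(changeInfo, tab, enabled) {
  return changeInfo.status === "complete" && tab.active === true && enabled === 1;
}

// Register the listener. `chromeApi` is normally the global `chrome` object;
// taking it as a parameter (an assumption) keeps the sketch testable.
function addListen(chromeApi, parseUrl) {
  chromeApi.tabs.onUpdated.addListener((tabId, changeInfo, tab) => {
    // Assumed storage area: chrome.storage.local
    chromeApi.storage.local.get(["number"], (items) => {
      if (shouldCheck(changeInfo, tab, items.number)) {
        parseUrl(tab);
      }
    });
  });
}

// Written outside any function, so it runs whenever the script is loaded.
if (typeof chrome !== "undefined" && chrome.tabs) {
  addListen(chrome, (tab) => console.log("checking", tab.url));
}
```

Keeping the decision in a small pure function means the On/Off flag and the tab state can be checked without a browser.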

 

This is where things get a little spaghetti-code-ish. This function is called when a new page has loaded, to make sure its URL doesn't contain the typical URL manipulation patterns. The first few lines, starting with "chrome.tabs.query", query the tabs within Chrome that are "active and lastfocused", i.e. the tab that has just loaded, and "var tab = tabs[0]" gets the specific tab that we're interested in.

Next, the first part of this function, "tab.url.includes("xn--",0)", checks whether the URL of our tab contains "xn--". This is the basic signature of Punycode URLs such as the example listed earlier; there's a great post explaining Punycode here. If the URL does contain it, the add-on sends an alert to the user to make them double check the site.
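A minimal sketch of the Punycode check described above might look like this. The "xn--" test and the tabs query follow the description; the helper name hasPunycode, the `warn` callback (for example `alert` where it is available) and passing the chrome object as a parameter are assumptions for illustration.

```javascript
// True if the URL carries the "xn--" prefix that marks a Punycode label.
function hasPunycode(url) {
  return url.includes("xn--");
}

// Look up the tab that has just loaded and warn if its URL uses Punycode.
// `chromeApi` is normally the global `chrome`; `warn` is a hypothetical
// callback that shows the message to the user.
function parseUrl(chromeApi, warn) {
  chromeApi.tabs.query({ active: true, lastFocusedWindow: true }, (tabs) => {
    const tab = tabs[0];
    if (tab && tab.url && hasPunycode(tab.url)) {
      warn("This address uses Punycode, please double check the site: " + tab.url);
    }
  });
}
```

Note that a plain substring test is deliberately cautious: it will also fire if "xn--" appears anywhere else in the address.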

Next the add-on checks for umlauts or other non-English characters that are masquerading as English characters. Sometimes a website handles them as the characters they are, e.g. in the URL they appear as "www.xyz.com/š", and sometimes they are represented as hexadecimal within percent-encoded URL strings, e.g. "www.xyz.com/%9a%", as shown here.

One thing to note, however: percent encoding is also how some websites handle form and request data, and I wouldn't want an alert being sent when a user is simply browsing a legitimate site. So I have programmed the add-on to only check up to the third slash, e.g. https://Google.com/ <-- up to here. This way the add-on verifies the domain only and does not send false alerts for whatever follows it (the path) on legitimate pages.

The code assigns a variable "upto" the index of the third slash within the URL (the third, so that the two slashes in http:// don't interfere). It then runs a for loop checking that every character before this third slash falls inside the standard English alphabet. This verifies that the domain of the site being visited is correct and not masquerading as something else, while the rest of the URL (the path) is ignored.
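The loop described above can be sketched roughly as follows. The variable name "upto" comes from the description; the helper names domainEnd and findNonAsciiInDomain, and returning the offending character instead of raising an alert directly, are assumptions made to keep the example self-contained.

```javascript
// Index of the third slash (the one that ends the domain), or the end of
// the string if the URL has no path.
function domainEnd(url) {
  const schemeSlashes = url.indexOf("//");
  const start = schemeSlashes === -1 ? 0 : schemeSlashes + 2;
  const thirdSlash = url.indexOf("/", start);
  return thirdSlash === -1 ? url.length : thirdSlash;
}

// Return the first character before the third slash whose character code
// is above 127, or null if the domain part is plain ASCII.
function findNonAsciiInDomain(url) {
  const upto = domainEnd(url);
  for (let x = 0; x < upto; x++) {
    if (url.charCodeAt(x) > 127) {
      return url.charAt(x);
    }
  }
  return null;
}

console.log(findNonAsciiInDomain("http://www.š.com/"));     // "š"
console.log(findNonAsciiInDomain("https://google.com/š"));  // null
```

The second call returns null because the "š" sits after the third slash, which is exactly the part the add-on is meant to ignore.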

This was the easy one to program, as it only checks for a character code larger than 127. The next one was more difficult, as it has to search for a percent-encoded hex string, convert the hex into a decimal number, and check whether it is greater than 127, to verify that the domain name isn't using homoglyph trickery to attempt to scam a user.

It uses the same for-loop limit "upto" as before to only check the domain. It then looks for a % sign followed by a hexadecimal digit (i.e. one of 0123456789ABCDEF, or ASCII 48-57 and 65-70), which is programmed as:

if ((tab.url.charCodeAt(x+1) >= 48 && tab.url.charCodeAt(x+1) <= 57) || (tab.url.charCodeAt(x+1) >= 65 && tab.url.charCodeAt(x+1) <= 70))

(Spaghetti code, I know, but it works quite efficiently.) The program then converts this hex string into decimal so it can evaluate whether it's larger than 127. If it is, an alert is sent to the user showing the offending character, so that they can double check the URL themselves.

To prevent multiple annoying alerts being sent, I implemented a type of break statement by setting the for-loop iteration variable to its maximum when the first alert is sent.
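Here is a hedged sketch of the percent-encoding check. It is not the extension's own code: it uses a regular expression and parseInt in place of the character-code comparisons, accepts lower-case hex digits as well (an assumption), and stops at the first hit by returning early in place of pushing the loop variable to its maximum. The helper names are hypothetical.

```javascript
// Index of the third slash, or the end of the string if there is no path.
function domainEnd(url) {
  const schemeSlashes = url.indexOf("//");
  const start = schemeSlashes === -1 ? 0 : schemeSlashes + 2;
  const thirdSlash = url.indexOf("/", start);
  return thirdSlash === -1 ? url.length : thirdSlash;
}

// Return the first "%XX" sequence before the third slash whose value is
// above 127, or null if there is none.
function findEncodedNonAscii(url) {
  const upto = domainEnd(url);
  for (let x = 0; x + 2 < upto; x++) {
    if (url.charAt(x) !== "%") continue;
    const hex = url.slice(x + 1, x + 3);
    if (!/^[0-9A-Fa-f]{2}$/.test(hex)) continue;
    if (parseInt(hex, 16) > 127) {
      return "%" + hex; // early return: only one alert per URL
    }
  }
  return null;
}

console.log(findEncodedNonAscii("http://xyz%9A.com/page")); // "%9A"
console.log(findEncodedNonAscii("http://xyz.com/%9A"));     // null
```

As with the previous check, an encoded character after the third slash is ignored, so ordinary form and request data does not trigger a warning.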

After programming the add-on, I tested it with the 3 types of manipulation techniques likely to be used. I wanted the add-on to err on the side of being slightly overcautious, but not to send alerts without good reason.

First of all, I tested it with http://аррӏе.com from above to check for Punycode-style URLs, which produced the following result.

This produced an alert as expected and shows the actual URL, not the "apple.com" that tricked the user into clicking it in the first place.

Next I tested the add-on with http://www.š.com to test for attacks where an attacker could replace the "s" of a legitimate URL with an "š". The test produced the following result.

This shows an alert for the character in the domain that falls outside the English alphabet. I also tested the same character after the third slash on my own website, which did not produce an alert, meaning the program was working as intended and only looking at the domain and not at the rest of the URL.

I was really happy with the functionality of the add-on, as it picked up all of the test cases I threw at it and wasn't triggered by anything it shouldn't have been. However, for false positives, where the user genuinely wants to browse a domain that contains one of these characters, I wanted to add the ability to easily toggle the add-on on and off.

This block of code, for which I closely followed the "color changer" example, simply toggles the add-on On and Off when the user clicks the extension icon. It turns the icon from green to red (On to Off) and vice versa. The add-on is totally inactive when it is switched off and doesn't send any alerts.
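The toggle could be sketched along these lines. The stored "number" flag and the green/red icon come from the description; the icon file names, the helper names, the use of chrome.storage.local, and the fallback between chrome.action and chrome.browserAction (which one exists depends on the manifest version) are assumptions.

```javascript
// Work out the next state from the current flag: 1 = On (green), 0 = Off (red).
// The icon file names are hypothetical.
function toggleState(number) {
  const next = number === 1 ? 0 : 1;
  return { number: next, icon: next === 1 ? "green.png" : "red.png" };
}

// Flip the stored flag and swap the icon whenever the extension icon is clicked.
// `chromeApi` is normally the global `chrome` object.
function addToggle(chromeApi) {
  const action = chromeApi.action || chromeApi.browserAction;
  action.onClicked.addListener(() => {
    chromeApi.storage.local.get(["number"], (items) => {
      const state = toggleState(items.number);
      chromeApi.storage.local.set({ number: state.number });
      action.setIcon({ path: state.icon });
    });
  });
}

if (typeof chrome !== "undefined" && chrome.storage) {
  addToggle(chrome);
}
```

Because the URL checks only run when the flag is 1, flipping it to 0 is enough to silence every alert.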

I added one more block of code that runs when the user first installs the extension, so that it is on by default.

This simply stores the value for On and sets the icon to the On mode. The add-on is therefore ready to parse URLs as soon as it is downloaded, and the user doesn't have to remember to turn it on.
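A minimal sketch of such an install hook, under the same assumptions as before (chrome.storage.local for the flag, a hypothetical "green.png" icon, and chrome.action or chrome.browserAction depending on the manifest version):

```javascript
// On first install, store the On value and show the On icon.
// `chromeApi` is normally the global `chrome` object.
function addInstallDefaults(chromeApi) {
  chromeApi.runtime.onInstalled.addListener(() => {
    chromeApi.storage.local.set({ number: 1 }); // 1 = On
    const action = chromeApi.action || chromeApi.browserAction;
    action.setIcon({ path: "green.png" }); // hypothetical icon file
  });
}

if (typeof chrome !== "undefined" && chrome.runtime) {
  addInstallDefaults(chrome);
}
```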

Happy with the progress I had made in learning how to create an extension for Chrome, I uploaded it to the web store, and I'm really happy with how it was received.

I hope this add-on helps prevent some people from falling victim to phishing attacks through manipulated URLs. I also hope to make more useful Chrome extensions like this one in the future, as it was really quick and easy to learn from scratch how these add-ons are written.