Wow!
This seems to be further evidence that the process for assigning Unicode code points has been thoroughly corrupted.
You can (apparently) copy/paste this on mobile:
“;” (Greek question mark, U+037E)
“;” (Semicolon, U+003B)
You can even render it in HTML:
; ;
And it’s included on Wikipedia, because of course it is:
Because I’m not sure what my mobile client will actually do with this comment, here’s the link to the HTML entity I used:
Also there’s plenty of other character joy to be had:
If I don’t understand what’s happening here but want to, should I research Unicode in general or something else?
Unicode is a way to encode the things that humans use to write stuff into a computer.
ASCII is another such way, as is EBCDIC.
All these methods translate squiggles that we’ve used for centuries into something that can be represented inside a computer.
For example, ASCII represents the letter “A” as the number 65.
This post points out that there are two characters that look identical but have different numbers, so what the user sees is the same while what the computer sees is different.
This is the basis for much tomfoolery.
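If you want to see the "same squiggle, different number" thing for yourself, here's a minimal Python sketch. The variable names are just ones I picked for illustration; `ord`, `unicodedata.name` and `unicodedata.normalize` are from the standard library.

```python
import unicodedata

semicolon = "\u003b"            # the ordinary ASCII semicolon
greek_question_mark = "\u037e"  # looks the same in most fonts

# Print the code point and official Unicode name of each character.
for ch in (semicolon, greek_question_mark):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

print(ord("A"))                          # 65, same number as in ASCII
print(semicolon == greek_question_mark)  # False: different code points

# Unicode treats the Greek question mark as canonically equivalent to the
# semicolon, so NFC normalization folds one into the other.
print(unicodedata.normalize("NFC", greek_question_mark) == semicolon)  # True
```

That last line is also why the Greek question mark often doesn't survive a copy/paste: anything that normalizes text along the way quietly turns it into a plain semicolon.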
This is actively used for phishing: you can craft a domain that looks nearly identical to the original one but resolves to your own IP address, which hosts the phishing page.
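A rough way to spot this kind of lookalike is to list every non-ASCII character in a domain along with its Unicode name. This is only a sketch: `describe_non_ascii` and the sample domain are made up for illustration, and real confusable detection is a lot more involved than "is it outside ASCII".

```python
import unicodedata

def describe_non_ascii(domain: str) -> list[tuple[str, str]]:
    """Return (code point, Unicode name) for each non-ASCII character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in domain
        if ord(ch) > 127
    ]

# Hypothetical lookalike: the "a" is CYRILLIC SMALL LETTER A (U+0430).
lookalike = "ex\u0430mple.com"

print(lookalike == "example.com")         # False, despite looking the same
print(describe_non_ascii(lookalike))      # [('U+0430', 'CYRILLIC SMALL LETTER A')]
print(describe_non_ascii("example.com"))  # []
```

Plenty of legitimate domains contain non-ASCII characters, so treat a hit as a reason to look closer and not as proof of phishing.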
One of my favorites was using the ideographic full stop used in Japanese (U+3002) in place of periods in a bare IP address, or anywhere you would use a period in an FQDN (fully qualified domain name). I only tested this in Chrome at the time, but the browser would “correct” it for you and take you to the intended page.
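The "correction" is probably not an accident: RFC 3490 (IDNA) says that U+3002, U+FF0E and U+FF61 must be recognized as dots alongside the ordinary U+002E. Here's a sketch of that mapping in Python. `normalize_dots` is a hypothetical helper, and I'm assuming the browser does something along these lines, not describing its actual code.

```python
# Characters that RFC 3490, section 3.1 treats as label separators,
# in addition to the ordinary full stop U+002E.
LABEL_SEPARATORS = "\u3002\uff0e\uff61"
_TO_DOT = str.maketrans({ch: "." for ch in LABEL_SEPARATORS})

def normalize_dots(host: str) -> str:
    """Replace IDNA dot lookalikes with an ASCII full stop."""
    return host.translate(_TO_DOT)

print(normalize_dots("192\u3002168\u30020\u30021"))  # 192.168.0.1
print(normalize_dots("www\u3002example\u3002com"))   # www.example.com
```

So a filter that only looks for a literal "." can miss an address that the browser will happily resolve.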