URL Encoding: The Most Boring Yet Essential Web Standard
One year, we launched a campaign page with a URL like this:
https://example.com/event/春节特惠
Everything worked in the test environment. After going live, users reported: "The link gives me a 404."
After hours of debugging, we found the issue: WeChat's in-app browser mangled the Chinese characters. The server received garbage instead of the expected path.
Since then, I have never put non-ASCII characters directly in a URL.
Why Can't URLs Have Chinese Characters?
This goes back to 1994.
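The early Web was built on ASCII. The original URL specification (RFC 1738, co-authored by Tim Berners-Lee and published in December 1994) restricted URLs to a small subset of US-ASCII: A-Z, a-z, 0-9, and a handful of special characters. Its successor, RFC 3986, narrows the always-safe "unreserved" set to letters, digits, and -._~, while characters such as /, ?, &, and = are reserved for URL structure.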
So what about Chinese, Japanese, Korean, Arabic, and every other language?
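The answer is "encoding." Convert any character outside the allowed set into %XX format, where XX is the hexadecimal value of one byte of the character (in practice, one byte of its UTF-8 encoding). A multi-byte character therefore becomes several %XX groups.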
Space becomes %20. The Chinese word "你好" becomes %E4%BD%A0%E5%A5%BD.
This is called Percent-Encoding, also known as URL Encoding.
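To make the byte-level rule concrete, here is a minimal sketch of percent-encoding written by hand. The function name percentEncode is hypothetical, and the choice to leave only the RFC 3986 unreserved characters untouched is an assumption for illustration. In real code you would normally call encodeURIComponent instead, which additionally leaves ! * ' ( ) unencoded.

```javascript
// Hypothetical helper for illustration: percent-encode each UTF-8 byte of
// `text`, leaving only RFC 3986 "unreserved" characters as they are.
function percentEncode(text) {
  const unreserved = /^[A-Za-z0-9\-._~]$/;
  const bytes = new TextEncoder().encode(text); // TextEncoder always emits UTF-8
  let out = '';
  for (const byte of bytes) {
    const ch = String.fromCharCode(byte);
    if (byte < 0x80 && unreserved.test(ch)) {
      out += ch; // safe ASCII character, keep it
    } else {
      // Everything else becomes %XX, with XX as two uppercase hex digits
      out += '%' + byte.toString(16).toUpperCase().padStart(2, '0');
    }
  }
  return out;
}

console.log(percentEncode('你好')); // %E4%BD%A0%E5%A5%BD
console.log(percentEncode('a b'));  // a%20b
```

Each Chinese character here occupies three UTF-8 bytes, which is why two characters produce six %XX groups.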
encodeURI vs encodeURIComponent
JavaScript provides two functions for URL encoding. Many developers confuse them.
encodeURI: Encodes a complete URL but preserves structural characters (/, ?, &, =, #, etc.)
encodeURIComponent: Encodes nearly everything, including /, ?, &, and =. Only letters, digits, and - _ . ! ~ * ' ( ) are left as-is.
When to use which?
- Encoding an entire URL: use encodeURI
- Encoding a query parameter value: use encodeURIComponent
Example:
const searchTerm = 'a=1&b=2';
// Wrong: the & in the value is treated as a parameter separator
const badUrl = `https://example.com/search?q=${encodeURI(searchTerm)}`;
// Result: https://example.com/search?q=a=1&b=2
// Right: the value is fully encoded
const goodUrl = `https://example.com/search?q=${encodeURIComponent(searchTerm)}`;
// Result: https://example.com/search?q=a%3D1%26b%3D2
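As an alternative to choosing an encoding function by hand, the standard URL and URLSearchParams APIs can handle parameter encoding for you. The sketch below is one possible approach, not the only one; buildSearchUrl is a hypothetical helper name, and the base URL is the same example address used above.

```javascript
// Hypothetical helper: let the URL API encode the query value.
function buildSearchUrl(base, term) {
  const url = new URL(base);
  // searchParams.set() encodes the value, so & and = cannot leak into the query structure
  url.searchParams.set('q', term);
  return url.toString();
}

console.log(buildSearchUrl('https://example.com/search', 'a=1&b=2'));
// https://example.com/search?q=a%3D1%26b%3D2
```

One caveat: URLSearchParams serializes in the form-encoding style, so a space in the value comes out as + rather than %20.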
Horror Stories From Production
Story 1: Double Encoding
The frontend encoded the value once, and the backend framework automatically encoded it again, so %20 became %2520. After a single round of decoding, the server ended up with the literal text %20 instead of a space.
Solution: Figure out who is responsible for encoding. Don't do it twice.
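When you suspect double encoding, one rough way to check is to decode repeatedly until the value stops changing and count the passes. The sketch below assumes that approach; countEncodingLayers and its maxLayers parameter are hypothetical names. Treat the result as a hint only: a value that legitimately contains the text %20 would be counted as one extra layer.

```javascript
// Hypothetical diagnostic: count how many decodeURIComponent passes it takes
// before the value stops changing.
function countEncodingLayers(value, maxLayers = 5) {
  let current = value;
  let layers = 0;
  while (layers < maxLayers) {
    let decoded;
    try {
      decoded = decodeURIComponent(current);
    } catch (err) {
      break; // malformed percent sequence: stop and report what we have
    }
    if (decoded === current) break; // nothing left to decode
    current = decoded;
    layers += 1;
  }
  return { layers, decoded: current };
}

console.log(countEncodingLayers('a%2520b')); // { layers: 2, decoded: 'a b' }
console.log(countEncodingLayers('a%20b'));   // { layers: 1, decoded: 'a b' }
```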
Story 2: Encoding Mismatch
The legacy system used GBK and the new system used UTF-8, so the same Chinese character produced different bytes: GBK encodes "中" as %D6%D0, while UTF-8 encodes it as %E4%B8%AD. They don't match. Chaos ensued.
Solution: Standardize on UTF-8. It's 2025.
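The mismatch is easy to reproduce. decodeURIComponent always interprets the bytes as UTF-8, so GBK-encoded input fails outright. The helpers below are hypothetical names for illustration. decodeWithCharset relies on TextDecoder accepting the gbk label, which is the case in current browsers and in Node.js builds with full ICU; treat that availability as an assumption about your runtime.

```javascript
// Hypothetical helper: returns the decoded string, or null if the bytes are not valid UTF-8.
function decodeAsUtf8(encoded) {
  try {
    return decodeURIComponent(encoded);
  } catch (err) {
    if (err instanceof URIError) return null;
    throw err;
  }
}

// Hypothetical helper: turn %XX sequences back into raw bytes.
// Assumes the input contains only ASCII characters and well-formed %XX groups.
function percentDecodeToBytes(encoded) {
  const bytes = [];
  for (let i = 0; i < encoded.length; i++) {
    if (encoded[i] === '%') {
      bytes.push(parseInt(encoded.slice(i + 1, i + 3), 16));
      i += 2; // skip the two hex digits just consumed
    } else {
      bytes.push(encoded.charCodeAt(i));
    }
  }
  return Uint8Array.from(bytes);
}

// Hypothetical helper: decode the raw bytes with an explicit charset label.
// Throws a RangeError if the runtime does not support that label.
function decodeWithCharset(encoded, label) {
  return new TextDecoder(label).decode(percentDecodeToBytes(encoded));
}

console.log(decodeAsUtf8('%E4%B8%AD')); // 中
console.log(decodeAsUtf8('%D6%D0'));    // null (not valid UTF-8)
```

Where the gbk label is supported, decodeWithCharset('%D6%D0', 'gbk') recovers the original character, which is a handy way to confirm that a legacy system is the source of the bytes.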
Story 3: Plus Sign vs Space
In the application/x-www-form-urlencoded format, a space can be encoded as +. But in a URL path, + is just a literal +, not a space.
Many developers have been burned by this. Form data gets pasted into a URL, spaces become plus signs, and the parsing breaks.
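The difference shows up directly in the standard APIs. URLSearchParams follows the form-encoding rules, while encodeURIComponent and decodeURIComponent follow plain percent-encoding. In the sketch below, decodeFormValue is a hypothetical helper for the case where you have to decode a form-encoded value by hand.

```javascript
// Encoding: the same space, two different results
const formEncoded = new URLSearchParams({ q: 'a b' }).toString(); // 'q=a+b'
const componentEncoded = encodeURIComponent('a b');               // 'a%20b'

// Decoding: + only means "space" under the form-encoding rules
const asForm = new URLSearchParams('q=a+b').get('q'); // 'a b'
const asComponent = decodeURIComponent('a+b');        // 'a+b' (plus sign kept)

// Hypothetical helper: decode a form-encoded value manually.
// Literal plus signs arrive as %2B, so replacing + first is safe.
function decodeFormValue(value) {
  return decodeURIComponent(value.replace(/\+/g, '%20'));
}

console.log(formEncoded, componentEncoded, asForm, asComponent);
console.log(decodeFormValue('a+b%2Bc')); // a b+c
```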
Why You Need a Dedicated Tool
Theoretically, you can do all this encoding with a single line of code. But in practice:
- When debugging an API, you want to quickly see what a parameter looks like after encoding
- You receive an encoded URL and want to see the original
- When troubleshooting, you need to confirm how many times something was encoded
Opening the browser console every time? Too much friction.
That's what our URL encoder/decoder tool solves. It supports both encodeURIComponent and encodeURI modes, handles multi-line text, and lets you switch between encoding and decoding with a click.
No rocket science. Just making a daily task smoother.