URL Encoding: The Most Boring Yet Essential Web Standard

Published on December 19, 2025

One year, we launched a campaign page with a URL like this:

https://example.com/event/春节特惠

Everything worked in the test environment. After going live, users reported: "The link gives me a 404."

After hours of debugging, we found the issue: WeChat's in-app browser mangled the Chinese characters. The server received garbage instead of the expected path.

Since then, I have never put non-ASCII characters directly in a URL.

Why Can't URLs Have Chinese Characters?

This goes back to 1994.

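The Web grew up in an English-speaking, ASCII-only world, and the early URL specification (RFC 1738, published in 1994) reflects that: a URL may only be written with a limited subset of US-ASCII characters. The current standard, RFC 3986, keeps the same rule. Only A-Z, a-z, 0-9, and the four characters -._~ are "unreserved." A few others, such as / ? & =, are reserved as delimiters, and everything else has to be encoded.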

So what about Chinese, Japanese, Korean, Arabic, and every other language?

The answer is "encoding." Any character outside the allowed set is first converted to bytes (UTF-8 on the modern web), and each byte is then written as %XX, where XX is the value of that byte in hexadecimal.

Space becomes %20. The Chinese word "你好" becomes %E4%BD%A0%E5%A5%BD.

This is called Percent-Encoding, also known as URL Encoding.
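The byte-by-byte rule can be sketched in a few lines. This is a minimal illustration, not a replacement for the built-in functions: percentEncode is a hypothetical helper name, and it assumes UTF-8 and treats only letters, digits, and -._~ as safe.

```javascript
// Hypothetical helper for illustration: percent-encode every byte that is not
// a letter, a digit, or one of - . _ ~ (assumes UTF-8 as the byte encoding).
function percentEncode(text) {
  const unreserved = /[A-Za-z0-9\-._~]/;
  const bytes = new TextEncoder().encode(text); // string -> UTF-8 bytes
  let out = '';
  for (const byte of bytes) {
    const ch = String.fromCharCode(byte);
    if (byte < 0x80 && unreserved.test(ch)) {
      out += ch; // safe ASCII character, keep as-is
    } else {
      // Everything else becomes % followed by two uppercase hex digits.
      out += '%' + byte.toString(16).toUpperCase().padStart(2, '0');
    }
  }
  return out;
}

console.log(percentEncode(' '));    // %20
console.log(percentEncode('你好')); // %E4%BD%A0%E5%A5%BD
```

Each Chinese character here takes three bytes in UTF-8, which is why two characters turn into six %XX groups.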

encodeURI vs encodeURIComponent

JavaScript provides two functions for URL encoding. Many developers confuse them.

encodeURI: Encodes a complete URL but preserves structural characters (/, ?, &, =, etc.)

encodeURIComponent: Encodes the structural characters as well (/, ?, &, =, and so on). Only letters, digits, and - _ . ! ~ * ' ( ) are left untouched.

When to use which?

  • Encoding an entire URL: use encodeURI
  • Encoding a query parameter value: use encodeURIComponent

Example:

const searchTerm = 'a=1&b=2';

// Wrong: the & in the value is treated as a parameter separator
const badUrl = `https://example.com/search?q=${encodeURI(searchTerm)}`;
// Result: https://example.com/search?q=a=1&b=2

// Right: the value is fully encoded
const goodUrl = `https://example.com/search?q=${encodeURIComponent(searchTerm)}`;
// Result: https://example.com/search?q=a%3D1%26b%3D2
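The reverse mistake is just as easy to make: running encodeURIComponent over an entire URL. A small sketch using the campaign URL from the opening story as sample input (the /redirect endpoint and the to parameter are hypothetical, chosen only for illustration):

```javascript
// The campaign URL from the opening story, used here purely as sample input.
const pageUrl = 'https://example.com/event/春节特惠';

// encodeURI keeps the scheme and slashes and encodes only the Chinese segment.
const encodedWhole = encodeURI(pageUrl);
// https://example.com/event/%E6%98%A5%E8%8A%82%E7%89%B9%E6%83%A0

// encodeURIComponent also encodes : and /, so the result is no longer usable
// as a URL on its own. It is the right choice only when the whole URL is
// itself a value inside another URL.
const encodedAsValue = encodeURIComponent(pageUrl);
// https%3A%2F%2Fexample.com%2Fevent%2F%E6%98%A5%E8%8A%82%E7%89%B9%E6%83%A0

// Hypothetical redirect endpoint, shown only to illustrate the valid use.
const redirectUrl = `https://example.com/redirect?to=${encodedAsValue}`;
console.log(redirectUrl);
```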

Horror Stories From Production

Story 1: Double Encoding

Frontend encoded once. Backend framework auto-encoded again. The % in %20 was itself encoded as %25, so %20 became %2520. After decoding once, the server ended up with the literal text %20 instead of a space.

Solution: Figure out who is responsible for encoding. Don't do it twice.
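When it is unclear how many times a value has been encoded, one rough way to find out is to keep decoding until the string stops changing. The sketch below is a debugging heuristic, not a fix: countEncodingLayers is a hypothetical helper name, and it will over-count if the original text legitimately contained something like a literal %20.

```javascript
// Hypothetical debugging helper: decode repeatedly until nothing changes.
function countEncodingLayers(value, maxLayers = 5) {
  let current = value;
  let layers = 0;
  while (layers < maxLayers) {
    let decoded;
    try {
      decoded = decodeURIComponent(current);
    } catch {
      break; // malformed escape sequence (e.g. a lone %): stop decoding
    }
    if (decoded === current) break; // nothing left to decode
    current = decoded;
    layers += 1;
  }
  return { layers, decoded: current };
}

console.log(countEncodingLayers('hello%20world'));   // { layers: 1, decoded: 'hello world' }
console.log(countEncodingLayers('hello%2520world')); // { layers: 2, decoded: 'hello world' }
```

The maxLayers cap is an arbitrary safety limit for the sketch. In a real system the goal is still to encode exactly once, at a known place.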

Story 2: Encoding Mismatch

Legacy system used GBK. New system used UTF-8. Same Chinese character, different bytes: GBK encodes "中" as %D6%D0, UTF-8 as %E4%B8%AD. They don't match. Chaos ensued.

Solution: Standardize on UTF-8. It's 2025.
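The mismatch is easy to reproduce, because decodeURIComponent always interprets the decoded bytes as UTF-8. A minimal sketch (tryDecodeUtf8 is a hypothetical wrapper, added here only so the failure can be shown without crashing):

```javascript
// Hypothetical wrapper: report whether a percent-encoded string is valid UTF-8.
function tryDecodeUtf8(encoded) {
  try {
    return { ok: true, value: decodeURIComponent(encoded) };
  } catch (err) {
    // decodeURIComponent throws URIError when the bytes are not valid UTF-8.
    return { ok: false, value: null };
  }
}

console.log(tryDecodeUtf8('%E4%B8%AD')); // { ok: true, value: '中' }
console.log(tryDecodeUtf8('%D6%D0'));    // { ok: false, value: null }
```

The GBK bytes D6 D0 do not form a valid UTF-8 sequence, so here the decoder rejects them outright. Other GBK byte pairs can happen to be valid UTF-8 and silently decode to the wrong character, which is harder to notice.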

Story 3: Plus Sign vs Space

In application/x-www-form-urlencoded format (HTML form submissions and most query strings), a space can be encoded as +. But in a URL path, + is just a literal +, not a space.

Many developers have been burned by this. Form data gets pasted into a URL, spaces become plus signs, and the parsing breaks.
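The difference shows up directly in JavaScript's built-ins. In the sketch below, decodePathSegment and decodeFormValue are hypothetical helper names used to make the two rules explicit:

```javascript
// Path rule: + has no special meaning, so only %XX sequences are decoded.
function decodePathSegment(segment) {
  return decodeURIComponent(segment);
}

// Form rule: + means space. Replace it first, then decode %XX sequences,
// so that an encoded plus (%2B) still comes back as a real +.
function decodeFormValue(value) {
  return decodeURIComponent(value.replace(/\+/g, ' '));
}

console.log(decodePathSegment('a+b')); // a+b
console.log(decodeFormValue('a+b'));   // a b
console.log(decodeFormValue('a%2Bb')); // a+b

// The built-ins follow the same split:
console.log(encodeURIComponent('a b'));                  // a%20b
console.log(new URLSearchParams({ q: 'a b' }).toString()); // q=a+b
console.log(new URLSearchParams('q=a+b').get('q'));      // a b
```

URLSearchParams applies the form rule in both directions, while encodeURIComponent and decodeURIComponent never treat + as a space. Mixing the two families on the same value is how spaces and plus signs get swapped.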

Why You Need a Dedicated Tool

Theoretically, you can do all this encoding with a single line of code. But in practice:

  • When debugging an API, you want to quickly see what a parameter looks like after encoding
  • You receive an encoded URL and want to see the original
  • When troubleshooting, you need to confirm how many times something was encoded

Opening the browser console every time? Too much friction.

That's what our URL encoder/decoder tool solves. It supports both encodeURIComponent and encodeURI modes, handles multi-line text, and lets you switch between encoding and decoding with a click.

No rocket science. Just making a daily task smoother.
