URL Encoding Explained - Why Chinese Links Need Encoding
When you see %E4%B8%AD in a URL, that's URL encoding at work. This guide explains why and how it works.
Why URL Encoding is Needed
URLs can only contain a limited set of ASCII characters:
- Letters and numbers (A-Z a-z 0-9)
- Unreserved marks: - _ . ~
- Reserved characters, which act as delimiters and must be encoded when used as data: ! * ' ( ) ; : @ & = + $ , / ? # [ ]
Any other character — including Chinese, spaces, and many symbols — must be percent-encoded.
How It Works
Each byte of the character is represented as % followed by two hexadecimal digits:
- Chinese character "中" → UTF-8 bytes E4 B8 AD → %E4%B8%AD
- Space → %20 (or + in form data)
- & → %26
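The byte-by-byte rule above can be sketched by hand. This is a minimal illustration, not how the built-in functions are implemented: percentEncodeAllBytes is a hypothetical helper name, and unlike encodeURIComponent it encodes every byte, including safe ones.

```javascript
// Hypothetical helper: percent-encode every UTF-8 byte of a string.
// Unlike encodeURIComponent, it does not skip "safe" characters.
function percentEncodeAllBytes(text) {
  const bytes = new TextEncoder().encode(text); // UTF-8 bytes as a Uint8Array
  return Array.from(
    bytes,
    (b) => "%" + b.toString(16).toUpperCase().padStart(2, "0")
  ).join("");
}

console.log(percentEncodeAllBytes("中")); // %E4%B8%AD
console.log(percentEncodeAllBytes(" "));  // %20
console.log(percentEncodeAllBytes("&"));  // %26
```

For characters outside the safe set, the output matches what encodeURIComponent produces.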
Two Key Functions in JavaScript
| Function | What It Leaves Unencoded | Use Case |
|---|---|---|
| encodeURI | A-Z a-z 0-9 - _ . ! ~ * ' ( ) plus the reserved characters ; / ? : @ & = + $ , # | Encode an entire URL |
| encodeURIComponent | Only A-Z a-z 0-9 - _ . ! ~ * ' ( ) | Encode parameter values |
Example
// Encoding a parameter value
const url = "https://example.com/search?q=" + encodeURIComponent("中文 & test");
// Result: https://example.com/search?q=%E4%B8%AD%E6%96%87%20%26%20test
// Encoding an entire URL
encodeURI("https://example.com/path with space")
// Result: https://example.com/path%20with%20space
Rule of thumb: Use encodeURI for full URLs, encodeURIComponent for parameter values.
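To see why the rule of thumb matters, here is a small sketch of what can go wrong when encodeURI is used on a parameter value. The sample value "a&b=c" and the variable names are made up for illustration.

```javascript
const rawValue = "a&b=c";

// encodeURI leaves & and = alone, so the value leaks into the query structure.
const wrongUrl = "https://example.com/search?q=" + encodeURI(rawValue);
// encodeURIComponent escapes them, so the value stays a single parameter.
const rightUrl = "https://example.com/search?q=" + encodeURIComponent(rawValue);

console.log(wrongUrl); // https://example.com/search?q=a&b=c
console.log(rightUrl); // https://example.com/search?q=a%26b%3Dc

// Parsing both URLs shows the difference: the first now carries two parameters.
console.log(new URL(wrongUrl).searchParams.get("q")); // a
console.log(new URL(rightUrl).searchParams.get("q")); // a&b=c
```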
Common Pitfalls
1. Double Encoding
encodeURIComponent(encodeURIComponent("中文"))
// Encoded twice: decoding once gives %E4%B8%AD%E6%96%87 instead of 中文
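The effect is easier to follow step by step. The following sketch (variable names are illustrative) shows that the second pass encodes each % as %25, so a single decode only undoes one layer.

```javascript
const originalText = "中文";
const encodedOnce = encodeURIComponent(originalText);
const encodedTwice = encodeURIComponent(encodedOnce); // every % becomes %25

console.log(encodedOnce);  // %E4%B8%AD%E6%96%87
console.log(encodedTwice); // %25E4%25B8%25AD%25E6%2596%2587

// One decode only peels off the outer layer.
console.log(decodeURIComponent(encodedTwice) === encodedOnce); // true
// A second decode is needed to get the text back.
console.log(decodeURIComponent(decodeURIComponent(encodedTwice)) === originalText); // true
```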
2. Space Representation
- %20: Standard URL encoding
- +: Only in application/x-www-form-urlencoded (form submissions)
Use %20 for URLs.
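A short sketch of how the two conventions show up in JavaScript, using a made-up sample phrase. Note that decodeURIComponent does not turn + back into a space, which is a common source of stray plus signs.

```javascript
const phrase = "hello world";

// Standard percent-encoding uses %20.
console.log(encodeURIComponent(phrase)); // hello%20world
// URLSearchParams follows the form-encoding convention and uses +.
console.log(new URLSearchParams({ q: phrase }).toString()); // q=hello+world

// decodeURIComponent leaves + untouched...
console.log(decodeURIComponent("hello+world")); // hello+world
// ...while URLSearchParams reads it as a space.
console.log(new URLSearchParams("q=hello+world").get("q")); // hello world
```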
3. Hash Fragment Cutoff
The # character separates the URL from its fragment. Anything after # is not sent to the server. If your data contains #, it must be encoded as %23.
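The cutoff can be observed with the standard URL parser. In this sketch the value "a#b" and the variable names are illustrative only.

```javascript
const dataWithHash = "a#b";

const unencodedUrl = new URL("https://example.com/search?q=" + dataWithHash);
const encodedUrl = new URL(
  "https://example.com/search?q=" + encodeURIComponent(dataWithHash)
);

// Without encoding, everything from # onward becomes the fragment.
console.log(unencodedUrl.search); // ?q=a
console.log(unencodedUrl.hash);   // #b

// With encoding, the whole value stays in the query string.
console.log(encodedUrl.search);                // ?q=a%23b
console.log(encodedUrl.searchParams.get("q")); // a#b
```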
Modern Approach: URLSearchParams
const params = new URLSearchParams({ q: "中文", page: "1" });
const url = `https://example.com/search?${params}`;
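// Result: https://example.com/search?q=%E4%B8%AD%E6%96%87&page=1
// URLSearchParams handles encoding automatically (note: it writes spaces as +, not %20)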
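URLSearchParams is also available through the searchParams property of a URL object, which avoids string concatenation altogether. A minimal sketch, with an illustrative variable name and sample values:

```javascript
const endpoint = new URL("https://example.com/search");
endpoint.searchParams.set("q", "中文 & test");
endpoint.searchParams.set("page", "1");

// Spaces are written as + because URLSearchParams uses form encoding.
console.log(endpoint.href);
// https://example.com/search?q=%E4%B8%AD%E6%96%87+%26+test&page=1

// Reading the value back decodes it automatically.
console.log(endpoint.searchParams.get("q")); // 中文 & test
```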
Use ToolNest URL Encode/Decode to quickly encode and decode URLs.