URL Encoding Explained - Why Chinese Links Need Encoding

Published 2025-03-02 · ToolNest

When you see %E4%B8%AD in a URL, that's URL encoding at work. This guide explains why and how it works.

Why URL Encoding is Needed

URLs can only contain a limited set of ASCII characters:

  • Letters and numbers (A-Z a-z 0-9)
  • Reserved characters: - _ . ~ ! * ' ( ) ; : @ & = + $ , / ? # [ ]

Any other character — including Chinese, spaces, and many symbols — must be percent-encoded.

How It Works

Each byte of the character is represented as % followed by two hexadecimal digits:

  • Chinese character "中" → UTF-8 bytes E4 B8 AD%E4%B8%AD
  • Space → %20 (or + in form data)
  • &%26

Two Key Functions in JavaScript

Function What It Encodes Use Case
encodeURI Keeps ;/?:@&=+$# Encode an entire URL
encodeURIComponent Encodes everything except A-Z a-z 0-9 - _ . ~ Encode parameter values

Example

// Encoding a parameter value
const url = "https://example.com/search?q=" + encodeURIComponent("中文 & test");
// Result: https://example.com/search?q=%E4%B8%AD%E6%96%87%20%26%20test

// Encoding an entire URL
encodeURI("https://example.com/path with space")
// Result: https://example.com/path%20with%20space

Rule of thumb: Use encodeURI for full URLs, encodeURIComponent for parameter values.

Common Pitfalls

1. Double Encoding

encodeURIComponent(encodeURIComponent("中文"))
// Encoding twice — decoding once gives %E4%B8%AD instead of 中文

2. Space Representation

  • %20: Standard URL encoding
  • +: Only in application/x-www-form-urlencoded (form submissions)

Use %20 for URLs.

3. Hash Fragment Cutoff

The # character separates the URL from its fragment. Anything after # is not sent to the server. If your data contains #, it must be encoded as %23.

Modern Approach: URLSearchParams

const params = new URLSearchParams({ q: "中文", page: "1" });
const url = `https://example.com/search?${params}`;
// URLSearchParams handles encoding automatically

Use ToolNest URL Encode/Decode to quickly encode and decode URLs.

← Back to Articles