URL Encoding Explained (Percent-Encoding for Developers)

URLs can only safely contain a limited set of ASCII characters. Everything else (spaces, accented letters, ampersands inside a value) must be percent-encoded: each byte of the character's UTF-8 encoding is replaced by a percent sign followed by two hex digits. A space becomes %20, an at sign becomes %40, and é, which takes two bytes in UTF-8, becomes %C3%A9. Get this wrong and links break, query parameters get truncated, or worse, you open an injection hole. Here is what actually matters.
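To make the byte-by-byte mechanics concrete, here is a minimal sketch. The helper percentEncodeAll is hypothetical and written only for illustration: it encodes every byte, including ones that would be safe to leave alone, so in real code you would use the built-in encodeURIComponent shown on the last line.

```javascript
// Hypothetical helper: percent-encode EVERY byte of a string's UTF-8 form.
// Real encoders skip safe characters; this one does not, to show the mechanics.
function percentEncodeAll(text) {
  const bytes = new TextEncoder().encode(text); // UTF-8 bytes
  return Array.from(
    bytes,
    (b) => "%" + b.toString(16).toUpperCase().padStart(2, "0")
  ).join("");
}

const space = percentEncodeAll(" ");       // "%20"
const at = percentEncodeAll("@");          // "%40"
const eAcute = percentEncodeAll("\u00e9"); // "%C3%A9" (é is two bytes in UTF-8)

// The built-in function leaves safe characters alone and encodes the rest.
const builtIn = encodeURIComponent("a b@\u00e9"); // "a%20b%40%C3%A9"
```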

Reserved versus unreserved characters

The URL spec (RFC 3986) splits characters into groups. Unreserved characters, the letters A to Z in both upper and lower case, the digits 0 to 9, and the four marks hyphen, period, underscore, and tilde, never need encoding. Reserved characters have structural meaning: the slash separates path segments, the question mark begins the query, the ampersand separates parameters, the equals sign assigns a value, and the hash starts a fragment.
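A quick way to see the two groups is to run samples of each through encodeURIComponent. This is only a sketch with hand-picked sample strings. One caveat: JavaScript's encodeURIComponent also leaves ! * ' ( ) unencoded, so it is slightly more permissive than the strict unreserved set.

```javascript
// Unreserved characters pass through unchanged.
const unreserved = "AZaz09-._~";
const unreservedOut = encodeURIComponent(unreserved); // "AZaz09-._~"

// The structural characters named above are all escaped when treated as data.
const reserved = "/?&=#";
const reservedOut = encodeURIComponent(reserved); // "%2F%3F%26%3D%23"
```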

The key insight is that reserved characters are only special in certain positions. An ampersand between query parameters is structure. An ampersand inside a parameter value is data and must be encoded as %26, or the parser will think a new parameter has started.
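The difference between an ampersand as structure and an ampersand as data shows up as soon as a parser reads the string. A small sketch using the standard URLSearchParams class; the parameter names and values are made up for illustration.

```javascript
// Raw ampersand inside the value: the parser sees three parameters.
const raw = new URLSearchParams("company=Johnson&Johnson&page=2");
const rawCompany = raw.get("company");   // "Johnson" (value truncated)
const strayKey = raw.has("Johnson");     // true (the tail became its own key)

// Encoded ampersand: the value survives intact.
const encoded = new URLSearchParams("company=Johnson%26Johnson&page=2");
const encodedCompany = encoded.get("company"); // "Johnson&Johnson"
```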

encodeURIComponent versus encodeURI

JavaScript gives you two functions, and choosing the wrong one is one of the most common URL bugs.

  • encodeURI is for an entire URL. It leaves the structural characters alone because it assumes they are doing their job: slashes, the question mark, ampersands, and the hash all pass through untouched.
  • encodeURIComponent is for a single piece you are inserting into a URL, such as one query parameter value. It encodes the structural characters too, because inside a value they are just data.

The rule: when you build a query string, encode each value with encodeURIComponent, then join them with literal ampersands and equals signs. If you encode the whole assembled string with encodeURI instead, an ampersand inside a value will survive and corrupt your parameters.
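The rule can be sketched as a small builder. The function buildUrl, the example.com base URL, and the parameter names are all assumptions made for illustration, not part of any library.

```javascript
// Hypothetical helper: encode each key and value separately, then join
// with literal "=" and "&" so only the structure stays unencoded.
function buildUrl(base, params) {
  const query = Object.entries(params)
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
    .join("&");
  return query ? `${base}?${query}` : base;
}

const good = buildUrl("https://example.com/search", { q: "salt & pepper", page: 2 });
// "https://example.com/search?q=salt%20%26%20pepper&page=2"

// The mistake: assembling first, then encoding the whole string with encodeURI.
const bad = encodeURI("https://example.com/search?q=salt & pepper&page=2");
// "https://example.com/search?q=salt%20&%20pepper&page=2"

const goodQ = new URL(good).searchParams.get("q"); // "salt & pepper"
const badQ = new URL(bad).searchParams.get("q");   // "salt " (cut off at the ampersand)
```

In the second case the ampersand survives encodeURI, so the receiving parser treats everything after it as a separate parameter.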

Query strings and the plus sign

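In the query portion, a space has historically been encoded as a plus sign by HTML form submission (the application/x-www-form-urlencoded format), while %20 is the general-purpose encoding. Both appear in the wild. When you decode, remember that a plus sign in a form-encoded query string means a space, but a plus sign in a path is a literal plus. JavaScript's decodeURIComponent leaves plus signs untouched, so form-encoded values need the plus-to-space conversion done separately. This asymmetry trips people up constantly.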
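The plus-sign asymmetry is easy to demonstrate. In this sketch, decodeFormValue is a hypothetical helper and the sample values are made up; URLSearchParams, decodeURIComponent, and URL are standard.

```javascript
// URLSearchParams follows form-encoding rules: "+" decodes to a space.
const viaParams = new URLSearchParams("q=red+shoes").get("q"); // "red shoes"

// decodeURIComponent does not: the plus sign comes back unchanged.
const viaDecode = decodeURIComponent("red+shoes"); // "red+shoes"

// Hypothetical helper for decoding a form-encoded value by hand:
// convert "+" to a space first, then percent-decode.
function decodeFormValue(value) {
  return decodeURIComponent(value.replace(/\+/g, " "));
}
const manual = decodeFormValue("red+shoes%21"); // "red shoes!"

// A literal plus in a value must be sent as %2B to survive form decoding.
const literalPlus = new URLSearchParams({ q: "1+1" }).toString(); // "q=1%2B1"

// In a path, a plus sign is just a plus sign.
const pathname = new URL("https://example.com/a+b").pathname; // "/a+b"
```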

Double-encoding bugs

Double encoding happens when already-encoded text is encoded again. The percent sign that begins an escape is itself encoded to %25, so %20 becomes %2520. Now the receiving side decodes once and gets %20 as literal text instead of a space. This shows up when a value passes through several layers (a frontend, a proxy, a backend), each helpfully encoding it. The cure is to encode exactly once, at the point where you assemble the URL, and decode exactly once where you consume it.
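The %20 to %2520 progression can be reproduced in a few lines. The sample string is arbitrary.

```javascript
const original = "hello world";

const once = encodeURIComponent(original); // "hello%20world"
const twice = encodeURIComponent(once);    // "hello%2520world" ("%" became "%25")

// A receiver that decodes once gets the escape back as literal text.
const receivedFromTwice = decodeURIComponent(twice); // "hello%20world"

// Encode once, decode once: the round trip is clean.
const receivedFromOnce = decodeURIComponent(once); // "hello world"
```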

Verify by hand

When a URL behaves strangely, do not guess. Drop the suspect string into URL Encode to see precisely what the safe form looks like, or paste a mangled value into URL Decode to reveal whether it has been encoded once, twice, or not at all. Seeing the before and after side by side makes double-encoding obvious instantly.

Both tools run entirely in your browser. URLs frequently embed session tokens, signed parameters, and API keys, so encoding them locally rather than on a remote server keeps those secrets off the wire.
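The same once, twice, or not at all check can also be scripted locally. The helper below, countEncodingLayers, is a hypothetical diagnostic and only a heuristic: it decodes repeatedly until the text stops changing, so a value that legitimately contains a literal escape sequence such as %20 would be counted as encoded.

```javascript
// Hypothetical diagnostic helper, not a standard API.
// Decodes until the value stops changing and reports how many passes it took.
function countEncodingLayers(value, maxLayers = 5) {
  let current = value;
  let layers = 0;
  while (layers < maxLayers) {
    let decoded;
    try {
      decoded = decodeURIComponent(current);
    } catch {
      break; // malformed escape (for example a lone "%"): stop decoding
    }
    if (decoded === current) break; // nothing left to decode
    current = decoded;
    layers += 1;
  }
  return { layers, decoded: current };
}

const notEncoded = countEncodingLayers("hello world");      // layers: 0
const encodedOnce = countEncodingLayers("hello%20world");   // layers: 1
const encodedTwice = countEncodingLayers("hello%2520world"); // layers: 2
```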

Encode once, choose the right function for whether you are handling a whole URL or a single value, and most percent-encoding headaches simply disappear.