How does geocoding work? A simple guide

(last updated) François Andrieux

What is geocoding and why does it matter?

Every day, we interact with addresses and places, whether we're searching for a nearby coffee shop on our phone or entering our delivery address at checkout. But here's the thing: computers don't naturally understand "123 Main Street" or "Central Park." They work with numbers and, more specifically, with latitude and longitude coordinates.

That's where geocoding comes in: Geocoding is the process of translating human-readable addresses into geographic coordinates that computers can use to pinpoint locations on a map. It's the invisible bridge between how we describe places and how machines navigate, analyze, and visualize them.

A geocoding example: 57 Rue de Varenne, 75007 Paris, France → 48° 51′ 15″ N, 2° 19′ 14.36″ E
A simple geocoding example.
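
The coordinates in the example are written in degrees, minutes, and seconds, while most software works with decimal degrees. As a minimal sketch (the function name is just an illustration), the conversion is simple arithmetic:

```python
def dms_to_decimal(degrees: float, minutes: float, seconds: float, hemisphere: str) -> float:
    """Convert degrees/minutes/seconds to signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    # By convention, South and West are negative.
    return -value if hemisphere.upper() in ("S", "W") else value


lat = dms_to_decimal(48, 51, 15, "N")
lon = dms_to_decimal(2, 19, 14.36, "E")
print(round(lat, 4), round(lon, 4))  # 48.8542 2.3207
```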

This guide will walk you through what geocoding is, how it works behind the scenes, what can go wrong, and how to get the best results. Let's dive in! 🙂

Note: You don't need to be technical to use geocoding well. Understanding a few key concepts will help you troubleshoot issues and make better decisions.



The two ways of geocoding

Geocoding actually works in two directions, and each serves a different purpose.

Forward geocoding: turning addresses into coordinates

Forward geocoding takes a text address (like 57 Rue de Varenne, Paris, France) or place name (like Tour Eiffel) and returns a pair of coordinates (latitude and longitude).

This is the exact process that happens when you search for a location in Google Maps or Apple Maps on your phone:

Google Maps search on the phone: 57 Rue de Varenne, Paris, France
In Google Maps, typing "57 Rue de Varenne, Paris, France" and getting the coordinates is a forward geocoding process.

Forward geocoding is an essential tool for a wide variety of operations. Listing every use case would get boring quickly, so here are just a few.

For consumers:

  • Finding places: Forward geocoding lets you search for locations in navigation apps. Just type a name or address to find a coffee shop, pharmacy, or tourist site.
  • Getting directions: All navigation apps use geocoding to get the coordinates of the destination and display it on the map. Then, the best route is computed by the navigation engine.
  • Package delivery: Your address is geocoded so the delivery driver can find the right location and drop off your package.

For businesses:

  • Address validation: Online stores and delivery services geocode addresses to check and standardize them at checkout.
  • Route optimization: Companies geocode address lists to plan deliveries efficiently and save resources.
  • Customer analytics: Businesses geocode customer addresses to spot trends and improve marketing or logistics.
  • Store locators: Retailers use geocoding so customers can easily find the nearest shop from their address.

And this list is not exhaustive. Yes, geocoding is everywhere! 😀

Reverse geocoding: turning coordinates into addresses

Reverse geocoding does the opposite: you start with a GPS point (latitude and longitude) and get back the nearest address or place name.

That's what happens when you pin a place on the map in Google Maps or Apple Maps:

Google Maps reverse geocoding example
In Google Maps, pinpointing a location and getting a suggestion for the address is a reverse geocoding process.

Again, here are some real world examples:

  • When you drop a pin on a map to show a friend where you are, the app reverse geocodes that point to display 57 Rue de Varenne, Paris, France instead of 48.8542, 2.3207.
  • Photos taken with your smartphone often store GPS coordinates in their metadata. Apps reverse geocode those coordinates to label your photos with Central Park, New York or Eiffel Tower, Paris.
  • A delivery driver scans a package at your door. The handheld device logs the GPS coordinates at that moment, and the system reverse geocodes it to confirm delivery at the correct address.
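
To make the idea concrete, here is a minimal sketch of reverse geocoding as a nearest-neighbor search. The three reference points and their approximate coordinates are assumptions for illustration. Real systems search millions of entries with spatial indexes rather than a simple loop.

```python
from math import asin, cos, radians, sin, sqrt

# Hypothetical reference points (coordinates are approximate).
KNOWN_PLACES = {
    "57 Rue de Varenne, Paris, France": (48.8542, 2.3207),
    "Tour Eiffel, Paris, France": (48.8584, 2.2945),
    "Central Park, New York, USA": (40.7829, -73.9654),
}


def haversine_km(a, b):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


def reverse_geocode(point):
    """Return the name of the known place closest to the given point."""
    return min(KNOWN_PLACES, key=lambda name: haversine_km(point, KNOWN_PLACES[name]))


print(reverse_geocode((48.8540, 2.3210)))  # 57 Rue de Varenne, Paris, France
```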

Good to know: In most cases, when people mention "geocoding", they’re talking about forward geocoding. If reverse geocoding is meant, it’s usually specified directly.


How geocoding works (simplified version)

Now let's walk through what happens when you provide an address to a geocoding system. Understanding these steps will help you recognize why some addresses work perfectly and others struggle.

Basically, geocoding involves matching an input address to the most similar entry in a large address database. In other words, it's a process of looking up an address in a database:

Address lookup table illustration
Address lookup: the geocoding system checks your input against a massive database of known addresses and places to find the best match.

When I type 57 ru de varenne Paris in Google Maps (yes, with the typo), the system will compare this input with Google's database of addresses and POIs (Points of Interest) and will return the closest match.

This simple address lookup is made of 4 steps:

  1. Address normalization
  2. Parsing
  3. Database lookup
  4. Scoring and ranking

Let's go through each step in detail.

Step 1: Address normalization

Addresses can be entered in many different ways: with typos, all uppercase, extra or missing information (like country or postal code), or with unnecessary details.

The normalization process aims to standardize addresses so they can be easily compared to those in the database. Here are some typical normalization steps:

  • Standardization: Convert to lowercase, remove extra spaces and symbols, etc. Example: 57 , AV des Champs-élysées (paris) becomes 57 avenue des champs elysees paris.
  • Expanding abbreviations: 57 R de Varenne may be expanded to 57 Rue de Varenne.
  • Correcting small typos: Obvious small errors can be fixed, e.g., 57 Reu de Varenne becomes 57 Rue de Varenne.
  • Removing filler words: We can remove words like "de", "la" (in French) or "of", "the", "in", e.g. 57 Rue de la Varenne in Paris could become 57 Rue Varenne Paris.
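
Most of these rules can be sketched in a few lines of Python. This is a simplified illustration, not how any particular provider does it. The abbreviation and filler-word tables are tiny hypothetical samples, and typo correction is left out.

```python
import re
import unicodedata

# Hypothetical sample tables; a real system would need far larger, per-language lists.
ABBREVIATIONS = {"r": "rue", "av": "avenue", "bd": "boulevard"}
FILLER_WORDS = {"de", "la", "des", "du", "of", "the", "in"}


def normalize(address: str, drop_fillers: bool = False) -> str:
    # Lowercase, then strip accents (é -> e) by removing combining marks.
    text = unicodedata.normalize("NFKD", address.lower())
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Replace punctuation and symbols with spaces, then split on whitespace.
    tokens = re.sub(r"[^a-z0-9]+", " ", text).split()
    # Expand known abbreviations.
    tokens = [ABBREVIATIONS.get(tok, tok) for tok in tokens]
    if drop_fillers:
        tokens = [tok for tok in tokens if tok not in FILLER_WORDS]
    return " ".join(tokens)


print(normalize("57 , AV des Champs-élysées (paris)"))
# 57 avenue des champs elysees paris
print(normalize("57 R de la Varenne in Paris", drop_fillers=True))
# 57 rue varenne paris
```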

Note: The amount and type of normalization and cleaning depend on the geocoding provider. Google Maps will have different rules than HERE, Mapbox, and so on. This is part of each provider's "secret sauce", and that's why one provider can fail where another succeeds.

Step 2: Parsing components

Once the address is normalized, we can try to parse it, i.e., split it into recognized component parts: house number, street name, unit/apartment number, city, postal code, region/state, and country.

Address parsing: the geocoding system splits the address into recognizable components.

Parsing is needed to narrow the search. When I type 57 rue de Varenne paris, should the system check every address in the world? Probably not!

Paris strongly suggests that we are in France (or at least, it's highly probable... we could also be in Paris, Texas, United States 😅).

Parsing works by identifying patterns. For example:

  • Countries are usually placed at the end of the address (like in 57 Rue de Varenne, Paris, France)
  • Postal codes are usually codes of 3 to 10 digits (or characters), depending on the country
  • House numbers are usually the first numbers encountered
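
As a rough sketch, these three patterns can be turned into a tiny rule-based parser. It assumes comma-separated input in the form "number street, [postal code] city[, country]", a five-digit postal code, and a hypothetical three-country list. Real parsers handle far more variation.

```python
import re

KNOWN_COUNTRIES = {"france", "united states", "canada"}  # hypothetical, tiny list


def parse_address(address: str) -> dict:
    parts = [p.strip() for p in address.split(",") if p.strip()]
    parsed = {}
    # Rule 1: the country is usually the last component.
    if parts and parts[-1].lower() in KNOWN_COUNTRIES:
        parsed["country"] = parts.pop()
    # Rule 2: the house number is usually the first number encountered.
    if parts:
        match = re.match(r"(\d+)\s+(.+)", parts[0])
        if match:
            parsed["house_number"], parsed["street"] = match.groups()
        else:
            parsed["street"] = parts[0]
        parts = parts[1:]
    # Rule 3: a postal code is a short run of digits (5 digits assumed here).
    if parts:
        postal_code, city = re.match(r"(?:(\d{5})\s+)?(.+)", parts[0]).groups()
        if postal_code:
            parsed["postal_code"] = postal_code
        parsed["city"] = city
    return parsed


print(parse_address("57 Rue de Varenne, 75007 Paris, France"))
# {'country': 'France', 'house_number': '57', 'street': 'Rue de Varenne', 'postal_code': '75007', 'city': 'Paris'}
```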

But all these rules can be unreliable, and the geocoding system must deal with a lot of inconsistencies.

  • Missing components: Sometimes components are missing (I didn't type the country or the postal code for my search)
  • Abbreviations: People write St (street? saint?), Av (avenue? aviation?), Dr (drive? doctor?). The system must use context to decide.
  • Different country formats: In the U.S., postal codes come at the end. In France or Germany, they come before the city name. In Japan, the structure is completely different (country, postal code, prefecture, city, district, block, building).
  • Multiple languages: Rue (French), Calle (Spanish), Rua (Portuguese), Ulica (Polish): they all mean "street," and the system needs to recognize them.

As an example, Google Maps still works well if I type 75015 Paris - Rue de Varenne 57, even though this does not follow the usual format.

Why parsing is powerful: Once the address is parsed, the geocoding engine can use the country, state, or city to filter out billions of irrelevant addresses. A postal code narrows it even further, often to just a few hundred possibilities. This is how geocoding systems stay fast and accurate even when dealing with enormous datasets.
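
One simple way to picture this filtering is an index keyed by country and postal code, built once ahead of time. The mini-database below is hypothetical, and real engines use more sophisticated index structures, but the principle is the same.

```python
from collections import defaultdict

# Hypothetical mini-database: (house number, street, postal code, country).
ADDRESSES = [
    ("57", "rue de varenne", "75007", "france"),
    ("58", "rue de varenne", "75007", "france"),
    ("57", "avenue des champs elysees", "75008", "france"),
    ("57", "rue de varenne", "63200", "france"),
    ("120", "main street", "11201", "united states"),
]

# Group addresses by (country, postal code) once, ahead of time.
index = defaultdict(list)
for house_number, street, postal_code, country in ADDRESSES:
    index[(country, postal_code)].append((house_number, street))


def candidates_for(country: str, postal_code: str) -> list:
    # Only the addresses in this bucket need to be compared with the input.
    return index.get((country, postal_code), [])


print(candidates_for("france", "75007"))
# [('57', 'rue de varenne'), ('58', 'rue de varenne')]
```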


Step 3: Database lookup - finding the match

Now that we have clean, parsed components, it's time to look them up in a reference database. Here is an example of the process:

  1. Take only addresses within the same country or state or city (if available, let's say France for our example)
  2. Find the addresses that match as many of our components as possible: house number, street name, city, postal code.

Address lookup table illustration for components
Illustration of the address lookup by components.

In the example above, we find 4 possible matches (entries that share at least one component with our input):

  • 57 Rue de Varenne, Riom, France
  • 57 Rue de Varenne, Paris, France
  • 58 Rue de Varenne, Paris, France
  • 57 Av. des Champs-Elysées, Paris, France
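
Here is a minimal sketch of such a component lookup with strict matching. The database is a hypothetical five-entry list, already normalized. The fifth entry shares no component with the input and is therefore not a candidate.

```python
# Hypothetical, already-normalized reference database (all entries in France).
DATABASE = [
    {"house_number": "57", "street": "rue de varenne", "city": "riom"},
    {"house_number": "57", "street": "rue de varenne", "city": "paris"},
    {"house_number": "58", "street": "rue de varenne", "city": "paris"},
    {"house_number": "57", "street": "avenue des champs elysees", "city": "paris"},
    {"house_number": "12", "street": "rue vivaldi", "city": "bourogne"},
]


def find_candidates(query: dict, database: list) -> list:
    """Return (entry, shared components) for entries sharing at least one component."""
    candidates = []
    for entry in database:
        shared = [field for field, value in query.items() if entry.get(field) == value]
        if shared:
            candidates.append((entry, shared))
    return candidates


query = {"house_number": "57", "street": "rue de varenne", "city": "paris"}
for entry, shared in find_candidates(query, DATABASE):
    print(entry["house_number"], entry["street"], entry["city"], "->", shared)
```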

For the sake of simplicity, we used strict matching, i.e. a component only matches if the input is exactly the same as the database entry. In reality, geocoding systems often use distance functions, like the Levenshtein distance, which help match addresses even when they contain typos or abbreviations (e.g. 57 Rue de Varene, with a missing letter n).
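
For reference, the Levenshtein distance counts the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into another. A compact textbook implementation (not tied to any provider) looks like this:

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b, computed row by row."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


print(levenshtein("rue de varene", "rue de varenne"))   # 1
print(levenshtein("reu de varenne", "rue de varenne"))  # 2
```

A geocoder could then accept a street name when the distance is small relative to the length of the name. Where to put that threshold is a design choice.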

Tokenization: Some geocoding systems use a process called tokenization, which means splitting an address into smaller pieces (tokens), usually words. Instead of trying to match the entire street name as one string, the system can match individual tokens. For example, champs elysées might be split into tokens like champs, elysées, champs elysées, and champs-elysées. This approach increases the chances of finding a match even if the address format varies or contains minor errors. Geocoding engines use different strategies to generate and compare tokens: some rely on phonetic similarity, others use techniques like stemming (reducing words to their base form) or lemmatization. The exact methods may vary between systems.
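
A minimal sketch of word-level tokenization: split on spaces and hyphens, keep single words and pairs of adjacent words, then compare two strings by the share of tokens they have in common. The function names and the overlap measure (Jaccard similarity) are illustrative choices, not a specific engine's method.

```python
def tokenize(text: str, max_n: int = 2) -> set:
    """Return single words and adjacent word groups (up to max_n words)."""
    words = text.lower().replace("-", " ").split()
    tokens = set()
    for n in range(1, max_n + 1):
        for i in range(len(words) - n + 1):
            tokens.add(" ".join(words[i:i + n]))
    return tokens


def token_overlap(a: str, b: str) -> float:
    """Jaccard similarity: shared tokens divided by all distinct tokens."""
    tokens_a, tokens_b = tokenize(a), tokenize(b)
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 0.0


print(sorted(tokenize("Champs-Elysées")))
# ['champs', 'champs elysées', 'elysées']
print(token_overlap("avenue des champs elysées", "Champs-Elysées avenue"))
# 0.5
```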

This also explains why a good reference database needs more than the most complete list of addresses possible: it should also include street names, administrative boundaries, cities, neighborhoods, POIs, and more. It is also why some providers give very different results from others. Commercial providers like Google or Apple have their own private data, whereas well-known open data initiatives like OpenStreetMap and OpenAddresses.io can give you good coverage of the world.

Google Maps, for instance, excels with commercial data like restaurant names or shop names. The US Census geocoder will never be able to match a restaurant name with a location; it's simply not in their database.


Step 4: Scoring and ranking - choosing the best match

As we saw, the database lookup often returns multiple candidates. Maybe there are two 57 Rue de Varenne addresses in neighboring towns. Or the postal code matches perfectly, but the street name is slightly off. How does the system decide which one to return?

If you look at the example above, it's clear that not all matches are equal. Some are more likely to be the correct one than others. If our input is 57 Rue de Varenne, Paris, France, we have:

  • 57 Rue de Varenne, Riom, France: same house number & street name but different city
  • 57 Rue de Varenne, Paris, France: same house number & street name & city
  • 58 Rue de Varenne, Paris, France: same street name & city
  • 57 Av. des Champs-Elysées, Paris, France: same house number & city

It becomes obvious that some components are more important than others, i.e. matching the right city is probably more important than matching the right house number. For our example, let's assign a weighted score:

  • 1 point for the house number
  • 2 points for the street name
  • 3 points for the city

The result would be:

  • 57 Rue de Varenne, Riom, France: 1 + 2 = 3 points
  • 57 Rue de Varenne, Paris, France: 1 + 2 + 3 = 6 points 🏆
  • 58 Rue de Varenne, Paris, France: 2 + 3 = 5 points
  • 57 Av. des Champs-Elysées, Paris, France: 1 + 3 = 4 points

57 Rue de Varenne, Paris, France has the highest score (6 points), so that's the candidate that will be returned! 🎉
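
The same computation in Python, as a sketch using the weights above. The confidence value (score divided by the maximum possible score) is an illustrative assumption. Providers compute their scores in their own ways.

```python
WEIGHTS = {"house_number": 1, "street": 2, "city": 3}


def score(query: dict, candidate: dict) -> int:
    """Sum the weights of the components that match exactly."""
    return sum(
        weight for field, weight in WEIGHTS.items()
        if query.get(field) == candidate.get(field)
    )


def rank(query: dict, candidates: list) -> list:
    """Order candidates from best to worst score."""
    return sorted(candidates, key=lambda c: score(query, c), reverse=True)


query = {"house_number": "57", "street": "rue de varenne", "city": "paris"}
candidates = [
    {"house_number": "57", "street": "rue de varenne", "city": "riom"},
    {"house_number": "57", "street": "rue de varenne", "city": "paris"},
    {"house_number": "58", "street": "rue de varenne", "city": "paris"},
    {"house_number": "57", "street": "avenue des champs elysees", "city": "paris"},
]

for candidate in rank(query, candidates):
    points = score(query, candidate)
    confidence = points / sum(WEIGHTS.values())  # 1.0 means every component matched
    print(points, f"{confidence:.2f}", candidate["house_number"], candidate["street"], candidate["city"])
```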

Good to know: This is how geocoding providers score their results and return confidence scores. Some use different weights, some use different distance functions, and so on, but you get the idea. It is another reason why some geocoding providers perform better than others.

Coming up with a good scoring system isn't easy, and there's no single right answer. Is the city name always worth 3 points, or should that change if the name's a bit ambiguous? What about situations where parts of the address are missing or not very reliable? Tweaking the weights for each component can have a big effect on your results, and it often depends on where your data is from and the types of addresses you get.

In practice, building these ranking rules usually involves a lot of testing, tweaking, and sometimes even a dash of machine learning to get the best possible matches.


What could go wrong? Understanding geocoding failures

Now that you understand how geocoding works, let's talk about what can go wrong. You may be surprised by how often geocoding fails, and how often the geocoding provider doesn't clearly flag the failure 😅.

Problem #1: Database issues (missing, incorrect, or outdated data)

Geocoding systems are only as good as the data they use. If the underlying database has missing information or is out of date, even perfect input won't produce accurate results.

Missing data:

  • Rural areas: A farmhouse on a county road might not have a specific street address in the database. The geocoder might place it at the road centerline or at the entrance to the property: close, but not exact.
  • New construction: A brand-new apartment building might not yet appear in the address database. It could take weeks or months for mapping providers to update their data after construction is complete.
  • Private roads and gated communities: These are often excluded from public datasets. If you're trying to geocode an address inside a private community, the system might only get you to the main entrance.

Ambiguous data:

  • Street renaming: A city renames Washington Avenue to Martin Luther King Jr. Boulevard. For months, the old name still appears in the database, and geocoding requests using the new name fail or return low-confidence results.
  • Multiple languages: Some addresses exist in multiple languages, e.g. 9 Rue des pensées, 1030 Schaerbeek should also exist as Penseestraat 9, 1030 Schaarbeek (the street name and even the city name are different). They both represent the exact same address, but in different languages (French and Dutch).

How to deal with this issue? Don't simply trust the geocoding provider's result. Always check it and, if it's not correct, fix it manually. Coordable automatically detects these issues for you.

Example 1: Google Maps

At the time of writing (October 2025), Google Maps does not have the address 12 rue sur le rang, Bourogne in France. Instead, it insists on returning the address 12 Rue Vivaldi, 90140 Bourogne, France.

Google Maps wrong example
Google Maps wrong example: the address 12 rue sur le rang, Bourogne is not located correctly.
Mapbox is correct because it probably uses the open source French database for this example.

This example is not that bad, in the sense that the result is pretty close to the correct location. However, it's still a false positive: a result returned with high confidence that is not the correct one.

Example 2: HERE

To be fair, all geocoding providers have their flaws. Here is another example with HERE, where Google actually works better.

In this example, we geocode the address 38 b faubourg de montbéliard, Delle. It's unclear whether there are two addresses with the same house number there (38 and 38B), but HERE returns something completely wrong with a high score (1.0).

HERE wrong example
HERE wrong example: the address 38 b faubourg de montbéliard, Delle is not located correctly.
Google Maps is correct in this case.

The problem here seems to be a lack of updates in the open source French database. As HERE uses it, it also inherits its errors.


Problem #2: Parsing failures when the system can't make sense of the input

Most of the time, addresses are messy. They contain extra noise, typos, mixed languages, and other formatting issues.

But sometimes, the parser can't understand the input at all:

  • CALL JOHN AT 555-239-4829 120 MAIN STREET APT 5B, BROOKLYN, NY, USA: too much noise (phone number, instructions, etc.)
  • 1234ElmStreetLosAngelesCA90001: no spaces or separators, so the components run together.
  • ru Saint cathrine West, Montreal QC H3B 1B5 Canda: typos, mixed languages, etc.
  • RUE DES F?TES, 75019 PARIS: encoding issue, special character (should be Rue des Fêtes)

Try these examples with your favorite maps application. You might be surprised to see how often they fail.

Parsing errors are a real problem: businesses processing thousands of global addresses using poorly designed parsers often see failed deliveries or mismatched addresses, all due to small parsing errors that could have been avoided with better cleaning and normalization.

How to deal with this issue? Cleaning the input is a good way to improve the results. E.g. if you know that your addresses have encoding issues, fix them before sending the addresses to the geocoding provider. Coordable has automated AI cleaning that deals with all the above issues and more.

Example 1: basic encoding issue

In this example, the address RUE DES F?TES, 75019 PARIS cannot be geocoded correctly in either Mapbox or Nominatim (the OpenStreetMap geocoder).

Encoding issue example 1
Illustration of an encoding issue: the address RUE DES F?TES, 75019 PARIS is not properly geocoded.

In both cases, they will return the correct location if f?tes is replaced by fêtes.
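
One possible repair strategy, sketched here under strong assumptions: treat each "?" as "any single character" and look for a unique match in a list of known street names. The list is a hypothetical sample, and the approach only works when one "?" stands for exactly one lost character.

```python
import re
import unicodedata

KNOWN_STREETS = ["rue des fêtes", "rue des frères", "rue de varenne"]  # hypothetical sample


def wildcard_matches(text: str, known_names: list) -> list:
    """Return known names matching the text when each '?' is any single character."""
    # Escape the text, then turn each escaped '?' into the regex wildcard '.'.
    pattern = re.escape(text.lower()).replace(r"\?", ".")
    return [
        name for name in known_names
        # NFC keeps accented letters as single characters so '.' can match them.
        if re.fullmatch(pattern, unicodedata.normalize("NFC", name))
    ]


print(wildcard_matches("RUE DES F?TES", KNOWN_STREETS))  # ['rue des fêtes']
```

If the function returns exactly one name, it is a reasonable replacement. With zero or several matches, it is safer to leave the input untouched and flag it for review.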

Example 2: noise issue

In this example, we try CALL JOHN AT 555-239-4829 120 MAIN STREET APT 5B, BROOKLYN, NY, USA and then MAIN STREET APT 5B, BROOKLYN, NY, USA in Google Maps.

Noise issue example
Example of noise: extra phone numbers and instructions prevent the geocoder from finding the correct address.

The result is that the noise is not filtered out by Google Maps (which has, to be honest, one of the best parsers out there). Is it Google Maps' fault? Probably not. At this point, we should have cleaned the input ourselves.

You may think it's a silly example, but it's not 🫠. Fields often get mixed up because of human error.
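
As a rough sketch of what "cleaning the input ourselves" could mean here, the two regular expressions below remove North American style phone numbers and a few instruction keywords. Both patterns are assumptions tuned to this example and would need extending for real data.

```python
import re

# Assumed phone format: 3-3-4 digits separated by '-', '.' or a space.
PHONE_RE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")
# Assumed instruction keywords; removes the keyword and what follows, up to the next digit or comma.
INSTRUCTION_RE = re.compile(r"\b(?:call|attn|contact)\b[^\d,]*", re.IGNORECASE)


def strip_noise(raw: str) -> str:
    text = PHONE_RE.sub(" ", raw)
    text = INSTRUCTION_RE.sub(" ", text)
    # Collapse repeated whitespace and trim leftover spaces or commas.
    return re.sub(r"\s+", " ", text).strip(" ,")


print(strip_noise("CALL JOHN AT 555-239-4829 120 MAIN STREET APT 5B, BROOKLYN, NY, USA"))
# 120 MAIN STREET APT 5B, BROOKLYN, NY, USA
```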

Example 3: typos

In this example, we try to geocode the address ru Saint cathrine West, Montreal QC H3B 1B5 Canda in HERE and Mapbox. It's a mix of French and English with many typos. A correct address could be Rue Sainte-Catherine Ouest, Montreal QC H3B 1B5 Canada or St-Catherine West, Montreal QC H3B 1B5 Canada.

Typo issue example 1
Example of typos and mixed languages that can cause geocoding failures.

HERE returns the correct address, but Mapbox can't find it and returns Montreal, QC, Canada. It's interesting to play around with the suggestions in the HERE dropdown. It shows that both English and French versions are available, and it's not affected by the typo ru instead of Rue.


Best practices: how to get reliable geocoding results

Now that you understand how geocoding works and what can go wrong, let's talk about how to improve your results 😎. Whether you're a consumer troubleshooting a single address or a business processing thousands, these practices will help.

1. Pick the right geocoding provider

You can get better results by picking the right geocoding provider.

Why: Providers rely on different data, different parsing rules, different scoring systems, etc., so their results can differ a lot for the same address.

How: Try several geocoding providers on a sample of your own addresses and see which one works best for you. You can also try to use the geocoding provider's API to get the best results.

Good to know: on Coordable, you can select the geocoding provider from a dropdown in the address input field.

2. Always add context (country, region, postal code)

Geocoding accuracy improves dramatically when you provide strong context: country, state/region, and especially postal code.

Why: Cities, streets, and even building names can be duplicated across the world (or even within a single country). "Paris" could mean Paris, France or Paris, Texas. There are thousands of streets named "Main Street" in the United States alone.

How: At a minimum, always append the country to the address. It's a quick win. And if possible, add the postal code: it is a strong hint of the correct location.

3. Clean your addresses before you geocode

Messy, incomplete, or inconsistent addresses are the #1 cause of poor geocoding results. Cleaning (a.k.a. "address normalization") means fixing typos, filling in missing parts, converting to a canonical format, and removing extraneous noise.

Why: Geocoding engines often struggle with extra info ("Attn: John", "call before delivery", phone numbers), bad capitalization, spelling/formatting variations, or misplaced address parts. Clean inputs = far fewer errors.

How: You can implement some of the rules presented earlier in this article: normalization, abbreviation expansion, fixing encoding issues, etc.

Tools: Basic cleaning can be handled with spreadsheet formulas, or specialized software/APIs like Coordable, libpostal, or your geocoding provider's normalization endpoint.

4. Help the parser if possible

If possible, pass the individual address components to the geocoding provider instead of a single string. This helps the parser understand the address better.

Why: Parsing is one of the most important steps in geocoding. If it's failing, you will get poor results.

How: Some geocoding providers can accept a structured address input like a JSON object with the address components: {"house_number": "57", "street": "Rue de Varenne", "city": "Paris", "state": "Île-de-France", "country": "France", "postal_code": "75007"}.

5. Understand what you are geocoding

Are you sending residential addresses or POIs? Do you have mostly street names alone or do you have house numbers?

Why: Knowing what you are sending helps you judge the quality of your results. It's obvious, but if only 50% of your addresses are residential addresses, you should expect no more than 50% of results with house numbers.

How: You can implement your own rules to classify addresses. For instance, you could write a regex rule to detect postal codes and house numbers.
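
For instance, a very rough sketch with two regular expressions. The patterns assume 1 to 4 digit house numbers and 5-digit postal codes (as in France or the U.S.) and would need adapting for other countries.

```python
import re

HOUSE_NUMBER_RE = re.compile(r"\b\d{1,4}\b")  # assumption: a standalone 1-4 digit number
POSTAL_CODE_RE = re.compile(r"\b\d{5}\b")     # assumption: 5-digit postal codes


def describe(address: str) -> dict:
    return {
        "has_house_number": bool(HOUSE_NUMBER_RE.search(address)),
        "has_postal_code": bool(POSTAL_CODE_RE.search(address)),
    }


addresses = [
    "57 Rue de Varenne, 75007 Paris, France",
    "Rue de Varenne, Paris",
    "Tour Eiffel",
    "75019 Paris",
]
share = sum(describe(a)["has_house_number"] for a in addresses) / len(addresses)
print(f"{share:.0%} of inputs contain a house number")  # 25% of inputs contain a house number
```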

Good to know: when uploading a dataset or starting a geocoding job on Coordable, we automatically classify the addresses for you.

6. Don't trust results blindly

Geocoding systems return results even when they're not confident. If you blindly accept the first result, you might get the wrong location.

Why: The confidence score is computed by the geocoding provider to indicate how well the result matches its own data. It is not a guarantee that the result is correct: as the examples above show, a wrong result can come back with a high score.

How to avoid it: Implement your own rules to measure accuracy and correctness of the result, based on the information the provider gives you, as well as your own manual verification.
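
A minimal sketch of such a rule: compare the components you sent with the components that came back, and flag any mismatch regardless of the provider's score. The field names, the result structure, and the 0.95 confidence value are hypothetical.

```python
def verify(query: dict, result: dict, min_confidence: float = 0.8) -> list:
    """Return a list of issues; an empty list means no rule was violated."""
    issues = []
    for field in ("house_number", "street", "postal_code", "city"):
        expected, actual = query.get(field), result.get(field)
        # Only check the components that were present in the input.
        if expected and (actual or "").lower() != expected.lower():
            issues.append(f"{field}: expected {expected!r}, got {actual!r}")
    if result.get("confidence", 0.0) < min_confidence:
        issues.append("low provider confidence")
    return issues


query = {"house_number": "12", "street": "rue sur le rang", "city": "bourogne"}
result = {
    "house_number": "12",
    "street": "Rue Vivaldi",
    "postal_code": "90140",
    "city": "Bourogne",
    "confidence": 0.95,  # hypothetical value
}
print(verify(query, result))
# ["street: expected 'rue sur le rang', got 'Rue Vivaldi'"]
```

In practice, an exact string comparison is too strict for street names. Normalizing both sides first, or using a distance function, would reduce false alarms.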

Good to know: the Coordable platform embeds a powerful verification engine that can help you quickly identify false positives and incorrect results.


Conclusion

I hope you enjoyed this guide. I tried to explain how geocoding works in a simple way, with examples and practical tips to improve your results.

Many technical details, such as tokenization, distance functions, and scoring rules, were only touched on briefly. I will probably write a follow-up article to cover these topics. But at least this guide presents the big picture of the geocoding process.

If you are a business and want to improve your geocoding results, please contact us at contact@coordable.co.

Happy geocoding! ⭐