ChannelEngine: address parsing
About this article
This article describes how ChannelEngine parses addresses, gives examples of how common addresses are handled, and explains how to disable the related settings.
Table of contents
Disable house number validation
Convert diacritical signs to regular characters
Default shipment/billing address
What to do if an address is not parsed or is parsed incorrectly?
Introduction
While many marketplaces use a system to input the separate fields of a buyer's address, almost all marketplaces export these as address lines. However, several systems connected to ChannelEngine rely on individual address fields.
ChannelEngine attempts to parse those address lines into individual address fields again. Next to that, marketplaces do not always expose all the relevant address information in the same way shown in their back-end. E.g.: in some cases you only get the billing address, in other cases only the shipping address, etc.
Below you can find a few more reasons why address parsing is difficult:
- Not every marketplace validates the input received, so sometimes invalid data is entered and exported. This is usually caused by the form auto-fill supported by most browsers.
- Addresses are not standardized, and most countries have their own format. E.g.: some countries have addresses without house numbers, some have house numbers in front of the addresses, some at the end, etc.
- Company names are sometimes added to addresses but aren't standardized either (so something like 123 can be a company name as well if it's placed on a separate address line).
- Additional information (e.g.: "deliver at the red door three houses over on Mondays and Thursdays, between 10:00 and 14:30") is sometimes added to addresses – which further complicates the parsing process, as it is not part of an official address.
Due to those and other reasons, address parsing is a very complex process that is not 100% reliable. While there are external tools available that can assist with or even replace ChannelEngine's address parser logic, none of those is a one-size-fits-all solution – often enough they are also country-specific. This is why ChannelEngine uses its own logic to parse addresses.
How does it work?
Most marketplaces have all their address information within one or two address lines. However, in certain scenarios, they are spread over three distinct lines. That is why you see three lines in total when you look up the original address.
ChannelEngine's address parser logic takes the address lines and applies all possible parsing scenarios available to them. It then verifies this with an alternative parser. This process has multiple possible outcomes:
- If it results in one combined result, it applies that specific sub-parser scenario to the address.
- If it cannot apply any possible scenario to the address lines (e.g.: if the input of all three lines is TEST TEST TEST), it sets the status of the order to 'Requires correction'. That is done because a manual check and correction are needed to ensure that the address fields are filled out correctly, and to throw a notification on ChannelEngine. More information on that status and how to handle it can be found in the article Why is my order labeled as 'Requires correction'?.
- If multiple scenarios are possible, it applies the first one that could be found. This could be wrong, depending on the input and context. For example, Dutch addresses are usually formatted as street name + house number + house number addition, while a UK address might not have a house number and instead consist of a house name + area or building code. Both inputs can have exactly the same format within the same address line.
More detailed examples are available in the Examples section.
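The three outcomes above can be illustrated with a minimal sketch. It is not ChannelEngine's actual parser: the two scenario functions, their regular expressions, and the `parse_address` helper are hypothetical and only mimic the described behavior (no match leads to 'Requires correction', several matches lead to the first one being applied). The verification step with an alternative parser is left out.

```python
import re
from typing import Optional

def street_then_number(lines: list[str]) -> Optional[dict]:
    """Hypothetical scenario: 'Vondellaan 47' or 'Vondellaan 47 8th floor'."""
    for line in lines:
        match = re.fullmatch(r"(\D+?)\s+(\d+)(?:\s+(.+))?", line.strip())
        if match:
            return {
                "StreetName": match.group(1),
                "HouseNr": match.group(2),
                "HouseNrAddition": match.group(3),
            }
    return None

def number_then_street(lines: list[str]) -> Optional[dict]:
    """Hypothetical scenario: house number in front of the street name."""
    for line in lines:
        match = re.fullmatch(r"(\d+)\s+(\D+)", line.strip())
        if match:
            return {
                "StreetName": match.group(2),
                "HouseNr": match.group(1),
                "HouseNrAddition": None,
            }
    return None

# Order matters: when several scenarios match, the first one wins.
SCENARIOS = [street_then_number, number_then_street]

def parse_address(lines: list) -> tuple:
    """Return (order status, parsed fields or None) for up to three address lines."""
    cleaned = [line for line in lines if line]
    results = [fields for fields in (scenario(cleaned) for scenario in SCENARIOS) if fields]
    if not results:
        # No scenario applies, so a manual check is needed.
        return "Requires correction", None
    return "New", results[0]
```

With this sketch, `parse_address(["TEST", "TEST", "TEST"])` yields the 'Requires correction' status, while `parse_address(["Vondellaan 47 8th floor", "ChannelEngine", None])` yields the street name, house number, and addition as separate fields.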
Where can I view the original address lines?
There are three different options to view the original address lines:
1. On the order detail page
When an order is saved by ChannelEngine and the related address is parsed (i.e.: it does not have the order status 'Requires correction'), an eye icon is visible by each individual address. Click on that icon to view the original address lines.
Once selected, a popup opens to show the original address lines.
2. When editing an address, in case a correction is required
If the address lines on an order cannot be parsed and converted into individual address fields, the status of a new order is not 'New' but 'Requires correction'. If that happens, you can edit the current address fields by clicking on the pencil icon by the address.
You are then directed to a different page, where you can edit the address fields. At the top left section of the page, you can see the original address lines – and at the bottom of the page you can manually 'parse' the address lines, and update the individual fields.
3. Via the ChannelEngine API
The original address lines are retrievable via ChannelEngine's Merchant API, where they are visible as Line1, Line2, and Line3 in the order data. If you have an API connection with the Merchant API, you can ignore the address fields and use the original address lines. However, you have to parse the address yourself if that is necessary for your connected systems.
{
  "Content": [
    {
      "Id": 3028,
      "ChannelName": "HomeDeco",
      "ChannelId": 1007,
      "GlobalChannelName": "HomeDeco",
      "GlobalChannelId": 1055,
      "ChannelOrderSupport": "SPLIT_ORDER_LINES",
      "ChannelOrderNo": "90329878748",
      "Status": "SHIPPED",
      "IsBusinessOrder": false,
      "CreatedAt": "2020-11-13T15:35:46.5950462+01:00",
      "UpdatedAt": "2020-11-13T16:06:00.827804+01:00",
      "MerchantComment": null,
      "BillingAddress": {
        "Line1": "Vondellaan 47 8th floor",
        "Line2": "ChannelEngine",
        "Line3": null,
        "Gender": "MALE",
        "CompanyName": "ChannelEngine",
        "FirstName": "Henk",
        "LastName": "De Vries",
        "StreetName": "Vondellaan",
        "HouseNr": "47",
        "HouseNrAddition": "8th floor",
        "ZipCode": "2332AB",
        "City": "Leiden",
        "Region": null,
        "CountryIso": "NL",
        "Original": null
      },
      "ShippingAddress": {
        "Line1": "Rapenburg 1",
        "Line2": null,
        "Line3": null,
        "Gender": "FEMALE",
        "CompanyName": null,
        "FirstName": "Anna",
        "LastName": "De Vries",
        "StreetName": "Rapenburg",
        "HouseNr": "1",
        "HouseNrAddition": null,
        "ZipCode": "2311GG",
        "City": "Leiden",
        "Region": null,
        "CountryIso": "NL",
        "Original": null
      },
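If you prefer to work with the original address lines, a small helper along the following lines could collect them from the order data. This is a sketch only: the `SAMPLE` string is a trimmed copy of the response above, `original_lines` is a hypothetical helper name, and the way you retrieve the response from the Merchant API is not shown.

```python
import json

# Trimmed copy of the sample response above; only the address lines are kept.
SAMPLE = """
{
  "Content": [
    {
      "ChannelOrderNo": "90329878748",
      "BillingAddress": {"Line1": "Vondellaan 47 8th floor", "Line2": "ChannelEngine", "Line3": null},
      "ShippingAddress": {"Line1": "Rapenburg 1", "Line2": null, "Line3": null}
    }
  ]
}
"""

def original_lines(address: dict) -> list[str]:
    """Return the non-empty original address lines (Line1..Line3) in order."""
    return [address[key] for key in ("Line1", "Line2", "Line3") if address.get(key)]

orders = json.loads(SAMPLE)["Content"]
for order in orders:
    # Hand these lines to your own parser or pass them on unchanged.
    print(order["ChannelOrderNo"], original_lines(order["ShippingAddress"]))
```

Empty lines (null in the JSON) are skipped, so the helper returns between zero and three strings per address.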
Examples
Below you can find examples of why certain addresses are easily parsable and others are almost impossible to automatically parse to address fields. For a list of all possible exceptions that basic address rules can have, check out the page Falsehoods programmers believe about addresses.
Example 1 (normal address)
- Line 1 - ChannelEngine
- Line 2 - Vondellaan 47
- Line 3 - first door next to the pub
ChannelEngine tries to 'parse' this input to the relevant standardized fields, so the result from these address lines would most likely be:
- CompanyName - ChannelEngine
- StreetName - Vondellaan
- HouseNr - 47
What if the following address is submitted?
- Line 1 - ChannelEngine
- Line 2 - Vondellaan 47 8
- Line 3 - first door next to the pub
Is the extra 8 the floor number? Or did someone make a typo when submitting their house number?
What if the order of the address lines is different?
- Line 1 - first door next to the pub
- Line 2 - ChannelEngine Vondellaan
- Line 3 - 47
The first address line usually holds the company name, but this is just an order comment and the second line now holds both the real company name and the street. However, as company names and street names are not standardized, that is likely to result in an incorrectly parsed address. E.g.:
- CompanyName - first door next to the pub
- StreetName - ChannelEngine Vondellaan
- HouseNr - 47
A person would probably notice that this is wrong, but if this is the way a marketplace submits this address it is hard to parse.
Example 2 (UK address without a house number)
The UK is not the only country with challenging addresses, but its addresses often lack a house number. E.g.:
- Line 1 - Newton Lodge
- Line 2 - Shirehampton
- Line 3
That address has no house number, but it does exist as a building. If you have a lot of orders from the UK, there are options to deal with it. If your logistical system accepts address lines or addresses without house numbers, there should be no problems – and you are advised to disable house number validation as seen in the Disable house number validation section.
If your logistical system cannot accept address lines or addresses without house numbers, do not disable the house number validation. Instead, leave the settings as they are. That results in a situation where every order without a house number triggers a 'Requires correction' status and notification, but it does give you the option to enter a placeholder number. Entering a 1 (one) is usually a viable strategy in that case, but that is up to you.
Example 3 (odd input)
- Line 1 - Vondellaan 47
- Line 2 - 47
- Line 3
That is a very common scenario. The reason why it happens is unclear; most likely it is unvalidated input from the original buyer, caused by the browser automatically filling in the form data. In that example, the house number is placed on both the first and second lines. Is it a house number addition (e.g.: 47-47)? Is it accidental? Is it a company name?
Example 4 (lots of possible input)
- Line 1 - Calle Pàdua #25 3º 1ª escalera derecha
- Line 2
- Line 3
That is an actual address in Spain, but it contains three different numeric values – with a letter or character attached within the same line. So what is the actual house number? Is it 25? And what to do with the additional information, that is, 3rd floor and 1st staircase to the right?
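To make the ambiguity concrete, the sketch below lists every token in that line that contains a digit. The helper name `numeric_candidates` is hypothetical and the approach is deliberately naive; it only shows that a simple rule yields three equally plausible house number candidates.

```python
def numeric_candidates(line: str) -> list[str]:
    """Return every whitespace-separated token that contains at least one digit."""
    return [token for token in line.split() if any(ch.isdigit() for ch in token)]

line = "Calle Pàdua #25 3º 1ª escalera derecha"
print(numeric_candidates(line))  # three candidates: '#25', '3º' and '1ª'
```

Nothing in the tokens themselves says which one is the house number, which is why such input is hard to parse automatically.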
Example 5 (pick-up point)
- Line 1 - DHL Postnr. 866613948
- Line 2 - Postfiliale 538
- Line 3
That last example shows the address of a German DHL pick-up point. The address lines do not contain an actual existing street name, and if you do not use DHL as your own carrier you might encounter issues. ChannelEngine parses that as seen below, but there is no real correct address conversion as it is not a real address.
- CompanyName - DHL Postnr. 866613948
- StreetName - Postfiliale
- HouseNr - 538
Disable house number validation
Certain European countries, such as the UK, have addresses that do not have any form of house numbers. They are unique based solely on the combination of the street name and the house name. However, there are many connected systems (e.g.: Adobe Commerce (née Magento), some ERPs, etc.) that are configured to only accept addresses with house numbers.
By default, ChannelEngine's address parser looks for a house number within the address lines to add as a house number. If it cannot find one, it results in a 'Requires correction' status, because there is no way for ChannelEngine to 'know' if that is a problem for the connected systems. If you have a system connected that can handle these addresses without a house number, and do not want to receive notifications regarding the manual correction of those addresses, you can disable this validation.
Setting
To disable or re-enable the house number validation, go to Settings, Settings, Address and under Skip house number validation, select applicable countries from the dropdown.
Do not forget to click on Save after enabling this setting. Once it is enabled, all future addresses containing no clear house number no longer trigger a 'Requires correction' status and notification.
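The effect of this setting can be summarized in a short sketch. It is an illustration under assumptions, not ChannelEngine's implementation: the function name `order_status`, the `skip_countries` parameter, and the use of ISO country codes (such as GB for the UK) to represent the dropdown selection are all hypothetical.

```python
def order_status(fields: dict, country_iso: str, skip_countries: set[str]) -> str:
    """Decide the order status based on the presence of a house number."""
    if fields.get("HouseNr"):
        return "New"
    if country_iso in skip_countries:
        # Validation is skipped for this country, so a missing house number is accepted.
        return "New"
    return "Requires correction"

address = {"StreetName": "Newton Lodge", "HouseNr": None}
print(order_status(address, "GB", skip_countries=set()))   # Requires correction
print(order_status(address, "GB", skip_countries={"GB"}))  # New
```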
Disable address parsing
Even with house number validation disabled, there can be 'exotic' input from marketplaces resulting in unparsable addresses. If you have a system connected that can use address lines without any validation in place, you may consider completely disabling the address parsing. By doing that, you ensure that there is never an order blocked due to requiring manual validation.
Setting
To disable or re-enable the address validation, go to Settings, Settings, Address and toggle the Disable address validation setting.
Do not forget to click on Save after enabling this setting. Once it is enabled, no address lines are parsed by ChannelEngine, and no order triggers a 'Requires correction' status and notification again. This also results in some addresses in the ChannelEngine web interface remaining empty, as they rely on individual address fields for displaying.
Convert diacritical signs to regular characters
Addresses imported from the marketplaces may contain fields with diacritical signs, such as ó, à, and ü. As these fields can be misinterpreted or not supported by your connected systems, ChannelEngine's address parser offers a way to convert diacritical signs into regular characters. E.g.: from Québec to Quebec. For that, navigate to Settings, Settings, Address and enable the Convert diacritics to regular characters setting.
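If you need a similar conversion in your own system, one common technique is Unicode decomposition followed by removal of the combining marks. The sketch below uses Python's standard unicodedata module; it is not necessarily how ChannelEngine performs the conversion, and `strip_diacritics` is a hypothetical helper name.

```python
import unicodedata

def strip_diacritics(text: str) -> str:
    """Decompose characters (NFD) and drop the combining marks that remain."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

print(strip_diacritics("Québec"))  # Quebec
```

Note that characters without a Unicode decomposition, such as ø or ß, are left unchanged by this approach and would need an explicit mapping.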
Default shipment/billing address
Addresses imported from marketplaces may lack data, such as the customer's first or last name, gender, or other details. These fields are often required by your connected systems (e.g.: OMS, webstore), which can lead to interruptions in order exports to these systems. For example, Shopify requires all orders to include the complete first and last name.
To prevent any errors in the order export, go to Settings, Settings, and select Address from the right-hand side menu. In the Default shipment/billing address section, fill out the fields with placeholder data. The next time an order is missing address details, the system automatically populates missing fields with your entered data.
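Conceptually, the default address acts as a fallback for empty fields only. The sketch below illustrates that idea; the helper name `apply_defaults` and the placeholder values are hypothetical, and the field names follow the order data shown earlier in this article.

```python
def apply_defaults(address: dict, defaults: dict) -> dict:
    """Return a copy of the address in which empty fields are filled from the defaults."""
    filled = dict(address)
    for key, value in defaults.items():
        if not filled.get(key):
            filled[key] = value
    return filled

defaults = {"FirstName": "Unknown", "LastName": "Customer"}
incoming = {"FirstName": None, "LastName": "De Vries", "City": "Leiden"}
print(apply_defaults(incoming, defaults))
```

Fields that already contain data, such as LastName in this example, are never overwritten.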
What to do if an address is not parsed or is parsed incorrectly?
As mentioned throughout this article, address parsing is a complex and imperfect system. While on average well over 99% of all orders processed by ChannelEngine have no issues with address parsing, there are scenarios that are unknown to the address parser logic. This is especially true as ChannelEngine continuously adds marketplaces across the globe, and processes addresses and input it has never dealt with before.
Contact ChannelEngine
If you encounter an address that is not correctly parsed and want to keep using ChannelEngine's address validation (i.e.: disabling it is not an option), send your example to ChannelEngine's Support team. Make sure to mention the orders concerned, including the original address lines, and what the expected outcome should be according to you.
Note that those sorts of improvements do not have the highest priority, and such requests are usually handled in batches. What's more, this is a delicate process which requires multiple tests – as adding a new scenario should not result in other address formats being incorrectly parsed. Therefore, your patience is appreciated.