I don’t know how widespread this issue is, but for Outback Steakhouse at least, I am receiving a large number of duplicate items. This happens for both GET and POST requests. For example, look at items with IDs 513fc993927da70408001a4b and 513fc9c6673c4fbc260024b3. They’re both “Classic Blue Cheese Wedge Salad (Dressing Included)” and all the fields are identical except the former is missing an item_description and an old_api_id. Interestingly, they even have identical updated_at values.
This is the case for most items at Outback Steakhouse. I haven’t checked a large selection of restaurants, but it’s concerning that this is happening for one of the first restaurants I tried.
Am I doing something wrong with my query, or is there just duplicate data in your database? If it’s the latter and I need to cull the duplicates on my end, what is the proper method of culling? Discard anything missing an item_description if there’s an item with the same name containing an item_description? Discard anything without an old_api_id? I could add these restrictions to my query so I don’t get so many results back (and so I can avoid so many hits to your database), but I’d be afraid of accidentally culling an item that doesn’t have a duplicate.
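To make the culling question concrete, here is a minimal sketch of a conservative client-side cull. It only drops a record when a fuller twin exists, so an item without a duplicate is never discarded. It assumes each item arrives as a dict; item_name, item_description, old_api_id and updated_at are the fields mentioned above, while item_id and the function name are hypothetical.

```python
from collections import defaultdict

# Fields that may differ between otherwise identical records.
# "item_id" is a hypothetical name for the ID field.
IGNORED = {"item_id", "item_description", "old_api_id"}
OPTIONAL = ("item_description", "old_api_id")


def cull_duplicates(items):
    """Keep one record per group of twins, preferring the fullest one."""
    groups = defaultdict(list)
    for item in items:
        # Two items are twins when every field outside IGNORED matches.
        key = tuple(sorted((k, repr(v)) for k, v in item.items() if k not in IGNORED))
        groups[key].append(item)

    kept = []
    for twins in groups.values():
        # An item with no twin sits in a group of one and is always kept.
        best = max(twins, key=lambda it: sum(bool(it.get(f)) for f in OPTIONAL))
        kept.append(best)
    return kept
```

Because the match is on every remaining field (including updated_at), two items that merely share a name are left alone.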
This does not appear to be a widespread issue, but we are checking into how the items were duplicated now. Thanks for reporting this! Will respond back with an update shortly.
I have found the same thing. A simple search for milk returns, for example:
2 – Milk, Essential Everyday
3 – Milk, Hannaford
4 – Milk, Hood
2 – Milk, Berkeley Farms
All 1 cup servings
Grocery items may have duplicate names due to different brands having the same item. For example, the items you listed are milks from 4 different brands.
You can optionally filter the API results on the client side to show only unique names in each set of results, but when dealing with 500K different food products, there are not always distinct names possible for each item.
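As a rough illustration of that client-side filtering, the sketch below keeps the first result for each distinct name. It assumes the results are a list of dicts; item_name and brand_name are the field names used in this thread, and unique_by is a hypothetical helper, not part of the API.

```python
def unique_by(items, fields=("item_name",)):
    """Keep the first item seen for each distinct combination of `fields`."""
    seen = set()
    unique = []
    for item in items:
        key = tuple(item.get(f) for f in fields)
        if key not in seen:
            seen.add(key)
            unique.append(item)
    return unique
```

Passing fields=("brand_name", "item_name") instead keeps one entry per brand, which is probably the safer choice for grocery items like the milks above.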
I found the difference was the number of servings per container. I can understand that.
But the one I really have a question about is brand_name = Quest Bar and item_name = Protein Bar, Chocolate Chip Cookie Dough. It comes back with 21 entries. Doing a SELECT DISTINCT produces 8 rows. They are all 1 bar per container but have 3 different calorie counts: 170, 190 and 210. The biggest difference I found was the updated_at. BUT if I select the most current (last updated), it shows 170 calories, which does not match the nutrition facts on the Quest Nutrition website.
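For anyone wanting to inspect a case like this, a small sketch that groups the returned entries by calorie count and reports how many there are and the latest updated_at in each group. It assumes the entries are dicts, that updated_at is an ISO 8601 string (so plain string comparison orders it correctly), and that the calorie field is called calories; that field name is a guess, so adjust it to whatever the API actually returns.

```python
from collections import defaultdict


def summarize_by_calories(items, calories_field="calories"):
    """Map each calorie value to its entry count and most recent updated_at."""
    groups = defaultdict(list)
    for item in items:
        groups[item.get(calories_field)].append(item)

    summary = {}
    for calories, members in groups.items():
        summary[calories] = {
            "count": len(members),
            # Assumes ISO 8601 timestamps, which sort correctly as strings.
            "latest_updated_at": max(m["updated_at"] for m in members),
        }
    return summary
```

This only shows which calorie value was updated most recently; it says nothing about which value is correct.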
Can you post the item IDs so I can investigate further? Thanks!