Amazon Beauty Product Reviews Dataset (1996-2023)

Abstract:

Collection: AI Cases - NLP - Retail
Dataset: Amazon (1996-2023) beauty product reviews dataset
Dataset value: sentiment analysis, text classification

1. Problem Description

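Sentiment analysis is one of the most common text classification tasks. It analyzes a piece of text to determine whether the sentiment it expresses is positive, negative, or neutral. Understanding the public sentiment that your brand, product, or service evokes in online conversations is a basic tool of modern business, and sentiment analysis is the first step toward that goal.

This dataset can be used to build an entry-level sentiment analysis model for products, giving you a quick starting point for a model that can be taken to production.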
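As a minimal sketch of how star ratings could be turned into sentiment labels for such a starter model: the cut-offs below (4 stars and above positive, 2 and below negative, otherwise neutral) and the name `rating_to_sentiment` are illustrative assumptions, not something the dataset prescribes.

```python
def rating_to_sentiment(rating: float) -> str:
    """Map a 1.0-5.0 star rating to a coarse sentiment label.

    The cut-offs are an assumption made for illustration only.
    """
    if rating >= 4.0:
        return "positive"
    if rating <= 2.0:
        return "negative"
    return "neutral"


# Two (rating, title) pairs in the shape of the dataset's review records.
reviews = [
    (5.0, "Such a lovely scent but not overpowering."),
    (4.0, "Works great but smells a little weird."),
]

# Pair each text with its derived label, ready for a text classifier.
labeled = [(text, rating_to_sentiment(rating)) for rating, text in reviews]
```

Labels derived this way are only a proxy for sentiment: the second title above is mixed in tone even though its 4-star rating maps to "positive".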

2. Dataset Contents

Product Information

File: meta_All_Beauty.jsonl

Contents: product information for Amazon beauty products, from May 1996 to September 2023.

Data Structure

| Field | Type | Explanation |
| --- | --- | --- |
| main_category | str | Main category (i.e., domain) of the product. |
| title | str | Name of the product. |
| average_rating | float | Rating of the product shown on the product page. |
| rating_number | int | Number of ratings in the product. |
| features | list | Bullet-point format features of the product. |
| description | list | Description of the product. |
| price | float | Price in US dollars (at time of crawling). |
| images | list | Images of the product. Each image has different sizes (thumb, large, hi_res). The “variant” field shows the position of image. |
| videos | list | Videos of the product including title and url. |
| store | str | Store name of the product. |
| categories | list | Hierarchical categories of the product. |
| details | dict | Product details, including materials, brand, sizes, etc. |
| parent_asin | str | Parent ID of the product. |
| bought_together | list | Recommended bundles from the websites. |

Sample Data

Sample records:

{"main_category": "All Beauty", "title": "Howard LC0008 Leather Conditioner, 8-Ounce (4-Pack)", "average_rating": 4.8, "rating_number": 10, "features": [], "description": [], "price": null, "images": [{"thumb": "https://m.media-amazon.com/images/I/41qfjSfqNyL._SS40_.jpg", "large": "https://m.media-amazon.com/images/I/41qfjSfqNyL.jpg", "variant": "MAIN", "hi_res": null}, {"thumb": "https://m.media-amazon.com/images/I/41w2yznfuZL._SS40_.jpg", "large": "https://m.media-amazon.com/images/I/41w2yznfuZL.jpg", "variant": "PT01", "hi_res": "https://m.media-amazon.com/images/I/71i77AuI9xL._SL1500_.jpg"}], "videos": [], "store": "Howard Products", "categories": [], "details": {"Package Dimensions": "7.1 x 5.5 x 3 inches; 2.38 Pounds", "UPC": "617390882781"}, "parent_asin": "B01CUPMQZE", "bought_together": null}

{"main_category": "All Beauty", "title": "Yes to Tomatoes Detoxifying Charcoal Cleanser (Pack of 2) with Charcoal Powder, Tomato Fruit Extract, and Gingko Biloba Leaf Extract, 5 fl. oz.", "average_rating": 4.5, "rating_number": 3, "features": [], "description": [], "price": null, "images": [{"thumb": "https://m.media-amazon.com/images/I/41b+11d5igL._SS40_.jpg", "large": "https://m.media-amazon.com/images/I/41b+11d5igL.jpg", "variant": "MAIN", "hi_res": "https://m.media-amazon.com/images/I/71g1lP0pMbL._SL1500_.jpg"}, {"thumb": "https://m.media-amazon.com/images/I/41j2ocUzCtL._SS40_.jpg", "large": "https://m.media-amazon.com/images/I/41j2ocUzCtL.jpg", "variant": "PT01", "hi_res": "https://m.media-amazon.com/images/I/81OqvR94isL._SL1500_.jpg"}], "videos": [], "store": "Yes To", "categories": [], "details": {"Item Form": "Powder", "Skin Type": "Acne Prone", "Brand": "Yes To", "Age Range (Description)": "Adult", "Unit Count": "10 Fl Oz", "Is Discontinued By Manufacturer": "No", "Item model number": "SG_B076WQZGPM_US", "UPC": "653801351125", "Manufacturer": "Yes to Tomatoes"}, "parent_asin": "B076WQZGPM", "bought_together": null}
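Each line of the file is one JSON object, so the records can be read line by line. The sketch below is one possible way to do that; the helper name `iter_jsonl` is hypothetical, and the inline record is a trimmed copy of the first sample above with most fields omitted.

```python
import json
from typing import Iterable, Iterator


def iter_jsonl(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per non-empty JSON Lines row."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


# Trimmed copy of the first sample record (most fields omitted).
sample_lines = [
    '{"main_category": "All Beauty", "average_rating": 4.8, '
    '"rating_number": 10, "price": null, "parent_asin": "B01CUPMQZE"}',
]

products = list(iter_jsonl(sample_lines))

# JSON null is parsed as None, so price needs an explicit check before use.
priced = [p for p in products if p["price"] is not None]

# With the real file, the same helper could be fed an open file handle:
# with open("meta_All_Beauty.jsonl", encoding="utf-8") as f:
#     products = list(iter_jsonl(f))
```

Both sample records have `"price": null` and `"bought_together": null`, so code that consumes these fields should expect missing values.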

User Review Data

File: All_Beauty.jsonl

Contents: user reviews of Amazon beauty products, from May 1996 to September 2023.

Data Structure

| Field | Type | Explanation |
| --- | --- | --- |
| rating | float | Rating of the product (from 1.0 to 5.0). |
| title | str | Title of the user review. |
| text | str | Text body of the user review. |
| images | list | Images that users post after they have received the product. Each image has different sizes (small, medium, large), represented by the small_image_url, medium_image_url, and large_image_url respectively. |
| asin | str | ID of the product. (asin – Amazon Standard Identification Number) |
| parent_asin | str | Parent ID of the product. Note: Products with different colors, styles, sizes usually belong to the same parent ID. The “asin” in previous Amazon datasets is actually parent ID. Please use parent ID to find product meta. |
| user_id | str | ID of the reviewer |
| timestamp | int | Time of the review (unix time) |
| verified_purchase | bool | User purchase verification |
| helpful_vote | int | Helpful votes of the review |
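Since the field notes say to use the parent ID to find product metadata, a review can be linked to its product through `parent_asin`. The sketch below shows one way to do this with plain dictionaries; the records are trimmed and partly hypothetical, and the output key `product_title` is an illustrative name.

```python
# Trimmed, illustrative records; real ones carry many more fields.
meta = [
    {
        "parent_asin": "B01CUPMQZE",
        "title": "Howard LC0008 Leather Conditioner, 8-Ounce (4-Pack)",
    },
]
reviews = [
    {"parent_asin": "B01CUPMQZE", "rating": 5.0},  # hypothetical review
    {"parent_asin": "B00YQ6X8EO", "rating": 5.0},  # no matching meta in this sketch
]

# Index product metadata by parent_asin for constant-time lookups.
meta_by_parent = {m["parent_asin"]: m for m in meta}

# Keep only reviews whose product is present in the metadata index.
joined = [
    {**r, "product_title": meta_by_parent[r["parent_asin"]]["title"]}
    for r in reviews
    if r["parent_asin"] in meta_by_parent
]
```

Reviews without a matching metadata record are dropped here; whether to drop or keep them is a design choice that depends on the downstream task.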

Sample Data

Sample records:

{"rating": 5.0, "title": "Such a lovely scent but not overpowering.", "text": "This spray is really nice. It smells really good, goes on really fine, and does the trick. I will say it feels like you need a lot of it though to get the texture I want. I have a lot of hair, medium thickness. I am comparing to other brands with yucky chemicals so I'm gonna stick with this. Try it!", "images": [], "asin": "B00YQ6X8EO", "parent_asin": "B00YQ6X8EO", "user_id": "AGKHLEW2SOWHNMFQIJGBECAF7INQ", "timestamp": 1588687728923, "helpful_vote": 0, "verified_purchase": true}
{"rating": 4.0, "title": "Works great but smells a little weird.", "text": "This product does what I need it to do, I just wish it was odorless or had a soft coconut smell. Having my head smell like an orange coffee is offputting. (granted, I did know the smell was described but I was hoping it would be light)", "images": [], "asin": "B081TJ8YS3", "parent_asin": "B081TJ8YS3", "user_id": "AGKHLEW2SOWHNMFQIJGBECAF7INQ", "timestamp": 1588615855070, "helpful_vote": 1, "verified_purchase": true}
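The `timestamp` values in the samples have 13 digits, which suggests they are in milliseconds rather than seconds; this is an inference from the two records shown here, not a documented guarantee. Under that assumption, a conversion to a UTC datetime could look like the following (the function name `review_time` is hypothetical).

```python
from datetime import datetime, timezone


def review_time(timestamp_ms: int) -> datetime:
    """Convert a millisecond Unix timestamp to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)


# Timestamp of the first sample review above.
first = review_time(1588687728923)
print(first.date().isoformat())  # 2020-05-05
```

Treating the same value as seconds would give a date far in the future, which is a quick sanity check when loading the file.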

Dataset Citation Requirements

@article{hou2024bridging,
title={Bridging Language and Items for Retrieval and Recommendation},
author={Hou, Yupeng and Li, Jiacheng and He, Zhankui and Yan, An and Chen, Xiusi and McAuley, Julian},
journal={arXiv preprint arXiv:2403.03952},
year={2024}
}

4. Obtaining the Case Package

You must log in before you can download the file package.