The Amazon KDD Cup’24 competition poses a distinctive challenge: applying LLMs to E-commerce across multiple tasks. Our solution for Tracks 2 and 5 is a pipeline covering dataset construction, instruction tuning, post-training quantization, and inference optimization. At its core is EC-Guide, an instruction-tuning dataset tailored to E-commerce scenarios. We also heuristically integrated chain-of-thought (CoT) reasoning to strengthen the arithmetic capabilities of the LLMs, which improved performance in both tracks. For more details, please see our workshop paper: “EC-Guide: A Comprehensive E-Commerce Guide for Instruction Tuning and Quantization by ZJU-AI4H”.
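To make the CoT idea concrete, here is a minimal sketch of how an arithmetic sample could be turned into an instruction-tuning record whose target spells out the reasoning before the final answer. It is not the pipeline used for EC-Guide: the function names, the `instruction`/`output` field names, and the "Let's think step by step." suffix are assumptions for illustration. The only thing taken as given is the public GSM8K answer layout, where calculator annotations look like `<<48/2=24>>` and the final answer follows `####`.

```python
import re


def split_gsm8k_answer(answer: str) -> tuple[str, str]:
    """Split a GSM8K-style answer into (reasoning, final_answer).

    If the '####' marker is missing, the reasoning is empty and the
    whole string is returned as the final answer.
    """
    reasoning, _, final = answer.rpartition("####")
    # Drop GSM8K calculator annotations such as <<48/2=24>>.
    reasoning = re.sub(r"<<[^>]*>>", "", reasoning).strip()
    return reasoning, final.strip()


def to_cot_example(question: str, answer: str) -> dict[str, str]:
    """Build a hypothetical CoT-style instruction-tuning record."""
    reasoning, final = split_gsm8k_answer(answer)
    return {
        # Hypothetical prompt suffix that asks for step-by-step reasoning.
        "instruction": f"{question}\nLet's think step by step.",
        # The target keeps the reasoning chain, then states the answer.
        "output": f"{reasoning}\nThe answer is {final}.",
    }


example = to_cot_example(
    "Natalia sold clips to 48 of her friends in April, and then she sold "
    "half as many clips in May. How many clips did Natalia sell altogether "
    "in April and May?",
    "Natalia sold 48/2 = <<48/2=24>>24 clips in May.\n"
    "Natalia sold 48+24 = <<48+24=72>>72 clips altogether in April and May.\n"
    "#### 72",
)
print(example["output"])
```

Keeping the reasoning in the target, rather than only the final number, is what lets the tuned model learn to produce intermediate steps at inference time.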
EC-Guide | EC-Guide · Datasets at Hugging Face |
---|---|
Task type | Sub-tasks | #Examples | Source |
---|---|---|---|
Generation | Product Elaboration (PE) | 479 | ecinstruct |
Generation | Product Question and Answer (PQA) | 6,834 | amazonqa |
Generation | Category Recognition (CR) | 1,000 | amazonmetadata |
Generation | Explaining Pair Fashion (EPF) | 3,000 | PairwiseFashion |
Generation | Explaining Bought Together (EBT) | 2,315 | IntentionQA |
Generation | Extract Review Keyphrase (ERK) | 1,000 | ecinstruct |
Generation | Extract Product Keyphrase (EPK) | 3,000 | PairwiseFashion |
Generation | Product Keyword Summarization (PKS) | 1,296 | esci, ecinstruct, amazonreview |
Generation | Review Title Summarization (RTS) | 1,455 | amazonreview, Womens_Clothing_Reviews |
Generation | Multilingual Translation (MT) | 2,997 | amazon-m2, flores |
Multiple Choice Question (MCQ) | Select Product based on Attribute (SPA) | 520 | ecinstruct |
Multiple Choice Question (MCQ) | Select Attribute based on Product (SAP) | 1,385 | amazonreview |
Multiple Choice Question (MCQ) | Product Relation Prediction (PRP) | 1,499 | ecinstruct |
Multiple Choice Question (MCQ) | Query Product Relation Prediction (QPRP) | 2,150 | esci |
Multiple Choice Question (MCQ) | Query Product Relation Judgement (QPRJ) | 501 | ecinstruct |
Multiple Choice Question (MCQ) | Sentiment Analysis (SA) | 3,500 | ecinstruct, Womens_Clothing_Reviews |
Multiple Choice Question (MCQ) | Product Keyword Summarization (PKS) | 271 | esci |
Multiple Choice Question (MCQ) | Multilingual Description Matching (MDM) | 500 | amazonreview |
Multiple Choice Question (MCQ) | Arithmetic and Commonsense Reasoning (ACR) | 7,184 | gsm8k, commonsenseqa |
Retrieval | Inferring Potential Purchases (IPP) | 10,774 | ecinstruct, amazon-m2 |
Retrieval | Retrieving Review Snippets (RRS) | 810 | amazonreview |
Retrieval | Retrieving Review Aspects (RRA) | 1,000 | amazonreview |
Retrieval | Category Recognition (CR) | 7,500 | amazonmetadata |
Retrieval | Product Recognition (PR) | 2,297 | amazonmetadata |
Ranking | Query Product Ranking (QPR) | 4,008 | esci |
Named Entity Recognition (NER) | Named Entity Recognition (NER) | 7,429 | ecinstruct, amazonreview, product-attribute-extraction |
ALL | - | 74,704 | - |
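As a quick sanity check on the table, the per-task-type subtotals and the overall count can be recomputed from the sub-task rows. The sketch below copies the numbers from the table; the `ROWS` layout and variable names are illustrative and not part of the dataset's own tooling.

```python
from collections import defaultdict

# (task type, sub-task abbreviation, number of examples), copied from the table.
ROWS = [
    ("Generation", "PE", 479), ("Generation", "PQA", 6834),
    ("Generation", "CR", 1000), ("Generation", "EPF", 3000),
    ("Generation", "EBT", 2315), ("Generation", "ERK", 1000),
    ("Generation", "EPK", 3000), ("Generation", "PKS", 1296),
    ("Generation", "RTS", 1455), ("Generation", "MT", 2997),
    ("MCQ", "SPA", 520), ("MCQ", "SAP", 1385), ("MCQ", "PRP", 1499),
    ("MCQ", "QPRP", 2150), ("MCQ", "QPRJ", 501), ("MCQ", "SA", 3500),
    ("MCQ", "PKS", 271), ("MCQ", "MDM", 500), ("MCQ", "ACR", 7184),
    ("Retrieval", "IPP", 10774), ("Retrieval", "RRS", 810),
    ("Retrieval", "RRA", 1000), ("Retrieval", "CR", 7500),
    ("Retrieval", "PR", 2297),
    ("Ranking", "QPR", 4008),
    ("NER", "NER", 7429),
]

# Sum the example counts per task type.
totals = defaultdict(int)
for task_type, _sub_task, n_examples in ROWS:
    totals[task_type] += n_examples

grand_total = sum(totals.values())

for task_type, count in totals.items():
    print(f"{task_type}: {count:,}")
print(f"ALL: {grand_total:,}")  # ALL: 74,704
```

The subtotals come to 23,376 (Generation), 17,510 (MCQ), 22,381 (Retrieval), 4,008 (Ranking), and 7,429 (NER), which add up to the 74,704 reported in the ALL row.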
The EC-Guide dataset was either created manually or generated with ChatGPT. Our sources are ECInstruct, amazonqa, productGraph, PairFashionExplanation (amazonmetadata), IntentionQA, Amazon-Reviews-2023, Shopping Queries Dataset (ESCI-data), womens-ecommerce-clothing-reviews, amazon-m2, flores, gsm8k, commonsense_qa, and product-attribute-extraction; we thank their authors for their outstanding work.
@misc{EC-Guide,
title={EC-Guide: A Comprehensive E-Commerce Guide for Instruction Tuning and Quantization},
author={Zhaopeng Feng and Zijie Meng and Zuozhu Liu},
year={2024},
eprint={2408.02970},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2408.02970},
}