TY - JOUR
T1 - Extracting Complements and Substitutes from Sales Data
T2 - A Network Perspective
AU - Tian, Yu
AU - Lautz, Sebastian
AU - Wallis, Alisdair O. G.
AU - Lambiotte, Renaud
N1 - Funding Information:
YT is supported by the EPSRC Centre for Doctoral Training in Industrially Focused Mathematical Modelling (EP/L015803/1) in collaboration with Tesco PLC.
Publisher Copyright:
© 2021, The Author(s).
PY - 2021/8/25
Y1 - 2021/8/25
N2 - The complementarity and substitutability between products are essential concepts in retail and marketing. Qualitatively, two products are said to be substitutable if a customer can replace one product by the other, while they are complementary if they tend to be bought together. In this article, we take a network perspective to help automatically identify complements and substitutes from sales transaction data. Starting from a bipartite product-purchase network representation, with both transaction nodes and product nodes, we develop appropriate null models to infer significant relations, either complements or substitutes, between products, and design measures based on random walks to quantify their importance. The resulting unipartite networks between products are then analysed with community detection methods, in order to find groups of similar products for the different types of relationships. The results are validated by combining observations from a real-world basket dataset with the existing product hierarchy, as well as a large-scale flavour compound and recipe dataset.
AB - The complementarity and substitutability between products are essential concepts in retail and marketing. Qualitatively, two products are said to be substitutable if a customer can replace one product by the other, while they are complementary if they tend to be bought together. In this article, we take a network perspective to help automatically identify complements and substitutes from sales transaction data. Starting from a bipartite product-purchase network representation, with both transaction nodes and product nodes, we develop appropriate null models to infer significant relations, either complements or substitutes, between products, and design measures based on random walks to quantify their importance. The resulting unipartite networks between products are then analysed with community detection methods, in order to find groups of similar products for the different types of relationships. The results are validated by combining observations from a real-world basket dataset with the existing product hierarchy, as well as a large-scale flavour compound and recipe dataset.
KW - Market basket analysis
KW - Network modelling
KW - Product relationships
KW - Role extraction
KW - Sales data
UR - http://www.scopus.com/inward/record.url?scp=85113384838&partnerID=8YFLogxK
U2 - 10.1140/epjds/s13688-021-00297-4
DO - 10.1140/epjds/s13688-021-00297-4
M3 - Article
SN - 2193-1127
VL - 10
JO - EPJ Data Science
JF - EPJ Data Science
IS - 1
M1 - 45
ER -