A Dual-Graph Clustering Approach for Selecting Data and Parameter Granularities
Fisher College of Business, Ohio State University
While there are well-established model selection methods (e.g., BIC), they commonly condition on a priori selected data and parameter granularities. That is, researchers think they are doing model selection, but what they are really doing is model selection conditional on their chosen granularities.
We propose a new method, Bayesian dual-graph clustering (BDGC), to make these two decisions along with standard model and parameter inference. BDGC represents data and parameters as two separate graphs whose nodes (e.g., SKUs) are the unit of analysis. Then, (a) each graph is clustered using a covariate-driven distance function that makes the underlying drivers highly interpretable, and (b) data and parameter granularity posteriors are inferred akin to standard Bayesian model selection. Using a split-merge sampler, BDGC can (c) handle large graphs and (d) accommodate parameter restrictions; it also (e) nests other extant methods (e.g., latent-class analysis).
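To make the idea of covariate-driven clustering of graph nodes concrete, the following is a minimal sketch and not the BDGC procedure itself. It assumes a weighted Euclidean distance over node covariates and replaces posterior inference with a simple distance threshold; the covariates, the weights, the threshold `tau`, and both function names are hypothetical choices made only for illustration.

```python
import numpy as np

def covariate_distance(X, weights):
    """Weighted Euclidean distance between node covariate vectors (assumed form)."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((weights * diff ** 2).sum(axis=-1))

def threshold_clusters(D, tau):
    """Label connected components of the graph that links nodes with D <= tau."""
    n = D.shape[0]
    labels = -np.ones(n, dtype=int)
    current = 0
    for start in range(n):
        if labels[start] >= 0:
            continue  # node already assigned to a cluster
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in np.flatnonzero(D[i] <= tau):
                if labels[j] < 0:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels

# Hypothetical covariates for five SKUs: [price tier, pack size]
X = np.array([[1.0, 1.0], [1.1, 1.0], [3.0, 1.0], [3.1, 1.2], [5.0, 4.0]])
weights = np.array([1.0, 0.5])  # hypothetical covariate weights

D = covariate_distance(X, weights)
fine = threshold_clusters(D, tau=0.05)   # every SKU is its own cluster
coarse = threshold_clusters(D, tau=0.5)  # similar SKUs are pooled
print(fine, coarse)
```

In this toy setting the threshold plays the role of the granularity: a small `tau` keeps SKU-level units, while a larger one pools SKUs with similar covariates. Because the weights enter the distance directly, they indicate which covariates drive the pooling, which is the kind of interpretability the covariate-driven distance is meant to provide.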
We apply BDGC to a frequently purchased grocery category. The results show that the granularities chosen by BDGC, as compared to those from extant approaches, affect demand elasticities and optimal actions. We conclude by highlighting the generalizability of BDGC to a broad array of marketing problems.