Sensing, predicting and exploiting consumer visual attention in fast-paced marketing environments

Project Objectives

Motivation: Visual-centric social media platforms (e.g., Instagram) are arguably among the main communication channels that brands use to promote advertising campaigns and reach their target groups. During the design phase of advertising content, several choices affect brand prominence (e.g., logo/text placement and size), with the clear goal of attracting consumer attention through top-down and bottom-up visual/cognitive stimuli. The effectiveness of each design choice is typically determined post hoc, by analysing linked metadata (e.g., number of likes, comments) and computational marketing metrics (e.g., click-through rate, conversion rate). Although correlations between design choices (set as independent variables) and objective metrics have been found (Yoo, 2022), they remain unsuitable for predicting, during the design phase, how effectively the created content will attract consumer attention. A natural evaluation of each design choice during this phase should ideally include eye-tracking experiments (Wedel, 2022) that capture consumer attention in the form of a 2D visual saliency map, clearly identifying the effects of top-down and bottom-up attention signals. Moreover, this analysis should be performed on a per-target-group basis. With present technology, running this evaluation process for every possible ad design is impractical, time-consuming, and expensive. Automated tools that estimate saliency maps from the content itself (Liu, 2022) have promising potential. However, existing methodologies are too generic, are not market-specific, and are perhaps only suitable for evaluating attention from a bottom-up perspective, which makes them ill-suited to this task.

Ambition: CONVISE will design technology for predicting consumer attention maps from visual advertisement content, from both the collective and the individual consumer attention perspectives. A novel pixel-level regression problem will be formulated: given an input visual ad, predict the potential consumer attention maps without using any sensors. This problem definition is a market-relevant variant of the classic computer vision saliency detection problem and can be solved with a supervised learning strategy using deep learning. The performance of the newly developed technology will be evaluated on a CONVISE-captured dataset in terms of the Intersection over Union (IoU) metric (Yu, 2021), targeting an initial IoU of 60% with a prediction tolerance of 20% (i.e., the difference between the predicted values and the ground truth values) (KPI OB1). The developed technology will be compared with state-of-the-art saliency detection methods, with expected performance gains of at least 15% in relevant IoU metrics (KPI OB2), owing to the increased market relevance of the designed technology.
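As an illustration of how an IoU score between a predicted and a ground-truth attention map could be computed, the following is a minimal sketch and not the CONVISE evaluation protocol. It assumes that both maps are 2D grids of values in [0, 1] and that they are binarized with a fixed threshold before comparison; the threshold of 0.5, the function names and the example values are hypothetical. The 20% prediction tolerance mentioned above is not modelled here.

```python
# Minimal IoU sketch for attention maps (assumption: maps are binarized
# with a fixed threshold before the comparison; 0.5 is a hypothetical choice).

KPI_OB1_TARGET = 0.60  # initial IoU target stated in KPI OB1


def binarize(attention_map, threshold):
    """Turn a 2D grid of attention values into a grid of booleans."""
    return [[value >= threshold for value in row] for row in attention_map]


def iou(predicted, ground_truth, threshold=0.5):
    """Intersection over Union of the attended regions of two maps."""
    pred_mask = binarize(predicted, threshold)
    true_mask = binarize(ground_truth, threshold)
    intersection = 0
    union = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            intersection += int(p and t)
            union += int(p or t)
    # Two empty masks are treated as a perfect match.
    return intersection / union if union else 1.0


# Hypothetical 2x3 maps, for illustration only.
predicted = [[0.9, 0.8, 0.1],
             [0.2, 0.7, 0.0]]
ground_truth = [[1.0, 0.6, 0.0],
                [0.0, 0.4, 0.6]]

score = iou(predicted, ground_truth)
meets_target = score >= KPI_OB1_TARGET
```

In this toy example two cells are attended in both maps and four cells are attended in at least one of them, so the score falls below the 60% target.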

Motivation: The effect of perceived ad personalization is invaluable, as relevant studies have shown a strong correlation between personal ad relevance and click intention (De Keyzer, 2022). On the other hand, ad personalization also induces perceived ad intrusiveness, leading consumers to intentionally ignore the ads. For instance, ad personalization based on location seems to increase ad intrusiveness, while personalization based on consumer interests and demographics increases personal relevance (De Keyzer, 2018). Therefore, there is a need to define new “pervasive” content personalization strategies that do not infringe individual consumer privacy, since consumer demographics are personal data requiring explicit consumer consent. From the CONVISE perspective, consumers in the same age/gender group might also exhibit common attention patterns, as has been shown in personalized saliency detection experiments (Lee, 2021). In fact, recent marketing studies (Pfiffelmann, 2020) have reported a remarkable 76% correlation between consumer visual attention (in the form of sensor-extracted variables such as fixation count, average visit duration, and total fixation duration) and ad personalization. CONVISE will evaluate whether its attention prediction technology can be used to this end. In addition, since eye-tracking technology can potentially be used to identify individuals (Brendan, 2021), privacy risks must be addressed and mitigated.

Ambition: CONVISE will evaluate a new ad personalization strategy based on matching visual advertisements with predicted consumer attention. A new metric for predicting whether the designed advertisement content is aligned with the intended target group will be developed, based on the correlation between the predicted visual ad attention map and the attention map of each potential target group. The reliability of the newly developed ad relevance metric will be evaluated subjectively, through its correlation with individual/aggregated target group consumer declarative metrics, targeting at least 70% correlation (KPI OC1). Moreover, it will be evaluated objectively, by measuring the visual similarity between the individual/aggregated target group consumer attention map (obtained with the sensor developed in Objective A) and the predicted ad attention map (developed in Objective B), targeting a coherence of at least 80% between the two (KPI OC2). Finally, the developed attention sensing and prediction technology will be modified to prevent individual consumer recognition through sensor data analysis, according to k-anonymity principles (Mygdalis, 2021), without compromising the utility of the attention data (KPI OC3).
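The correlation-based matching described above could be sketched as follows. This is only an illustration under stated assumptions: the proposal does not specify the correlation measure, so the Pearson correlation coefficient over flattened maps is assumed here, and the function names, group labels and map values are hypothetical.

```python
import math

# Sketch of a correlation-based ad relevance score (assumption: Pearson
# correlation over flattened attention maps; the proposal does not fix
# the correlation measure).


def flatten(attention_map):
    """Flatten a 2D grid of attention values into a single list."""
    return [value for row in attention_map for value in row]


def pearson(map_a, map_b):
    """Pearson correlation coefficient between two equally sized maps."""
    xs, ys = flatten(map_a), flatten(map_b)
    if len(xs) != len(ys):
        raise ValueError("attention maps must have the same size")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant map carries no spatial pattern to correlate
    return cov / math.sqrt(var_x * var_y)


def best_matching_group(predicted_map, group_maps):
    """Score the predicted ad map against each target group's map."""
    scores = {name: pearson(predicted_map, group_map)
              for name, group_map in group_maps.items()}
    return max(scores, key=scores.get), scores


# Hypothetical 2x2 maps, for illustration only.
predicted_map = [[0.0, 1.0],
                 [2.0, 3.0]]
group_maps = {
    "group_a": [[0.0, 2.0], [4.0, 6.0]],  # same spatial pattern, rescaled
    "group_b": [[3.0, 2.0], [1.0, 0.0]],  # reversed spatial pattern
}

best_group, scores = best_matching_group(predicted_map, group_maps)
```

Under this assumption, a score close to 1 indicates that the predicted attention for the ad follows the same spatial pattern as the target group's attention, which is the kind of alignment that KPI OC2 sets a threshold on.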