Less Is More: Analyzing Text Abstraction Levels for Gender and Age Recognition Across Question-Answering Communities

Bibliographic Details
Main Author: Alejandro Figueroa
Format: Article
Language: English
Published: MDPI AG 2025-07-01
Series: Information
Online Access: https://www.mdpi.com/2078-2489/16/7/602
Description
Summary: In social networks such as community Question-Answering (cQA) services, members interact by asking and answering each other’s questions, and in this way they find counsel and solutions to very specific real-life situations. Community members therefore log into this kind of social network to satisfy information needs that cannot be readily resolved via traditional web searches. To expedite this process, these platforms also allow registered, and often unregistered, users to browse their archives. To encourage fruitful interactions, these websites need to be efficient at displaying contextualized/personalized material and at connecting unresolved questions to people willing to help. Here, demographic factors (e.g., gender) together with frontier deep neural networks have proved instrumental in overcoming these challenges. In fact, current approaches have demonstrated that it is plausible to achieve high gender classification rates by inspecting profile images or textual interactions. This work advances this body of knowledge by leveraging lexicalized dependency paths to control the level of abstraction across texts. Our qualitative results suggest that cost-efficient approaches exploit distilled frontier deep architectures (i.e., DistilRoBERTa) and coarse-grained semantic information embodied in the first three levels of the respective dependency tree. Our outcomes also indicate that relative/prepositional clauses conveying geographical locations, relationships, and finance yield only a marginal contribution when they appear deep in dependency trees.
ISSN: 2078-2489