A literature review of user privacy concerns in conversational chatbots: A social informatics approach: An Annual Review of Information Science and Technology (ARIST) paper
Abstract
Since the introduction of OpenAI's ChatGPT in late 2022, conversational chatbots have gained significant popularity. These chatbots are designed to offer a user-friendly interface for individuals to engage with technology using natural language in their daily interactions. However, these interactions raise user privacy concerns due to the data shared and the potential for misuse in these conversational information exchanges. Furthermore, there are no overarching laws and regulations governing such conversational interfaces in the United States. Thus, there is a need to investigate user privacy concerns. To understand these concerns in the existing literature, this paper presents a literature review and analysis of 38 papers, selected from 894 retrieved papers, that focus on user privacy concerns arising from interactions with text-based conversational chatbots through the lens of social informatics. The review indicates that the primary user privacy concern that has consistently been addressed is self-disclosure. This review contributes to the broader understanding of privacy concerns regarding chatbots and highlights the need for further exploration in this domain. As these chatbots continue to evolve, this paper acts as a foundation for future research endeavors and informs potential regulatory frameworks to safeguard user privacy in an increasingly digitized world.
1 INTRODUCTION
In the digital age, the rapid advancements in artificial intelligence (AI) and natural language processing (NLP) have paved the way for transformative innovations, among which chatbots stand at the forefront. Chatbots are designed to interact with users through text-based or voice-based conversations, resembling the way humans communicate and revolutionizing the way individuals and organizations interact in various domains (Dilmegani, 2022). Conversational text-based chatbots, unlike voice-based chatbots, which rely on spoken language, interact with users primarily through written messages, often within messaging platforms, websites, or mobile applications. These chatbots use algorithms, machine learning (ML) techniques, and sentiment analysis to continuously improve their understanding of language nuances and user preferences over time to provide a more engaging and useful conversation experience for users (Følstad et al., 2018). Moreover, NLP technologies allow chatbots to collect, analyze, and understand the semantics, context, and intent behind user text, enabling them to generate more appropriate and contextually relevant responses (Adamopoulou & Moussiades, 2020; Khanna et al., 2015; Shawar & Atwell, 2007).
Chatbots have had a significant impact in meeting the growing demand for personalized and efficient digital interactions in a range of sectors, including customer support, healthcare, education, and ecommerce (Hwang & Chang, 2021; Quiroga Pérez et al., 2020; Shawar & Atwell, 2007). They perform multiple tasks, including answering frequently asked questions (FAQs), assisting users in finding information, scheduling appointments, and even offering emotional support. They also provide personalized, friendly (e.g., Costa, 2018), and efficient interactions, addressing user inquiries, providing information, offering recommendations, and performing tasks based on the conversation's context (Brandtzaeg & Følstad, 2017; Ranoliya et al., 2017). By utilizing these tools effectively, organizations can enhance user engagement and streamline communication processes (Følstad et al., 2018; Zamora, 2017) while ensuring that the quality of their offerings is maintained, thereby avoiding potential quality-related failures (Adam et al., 2020; Chong et al., 2021).
However, to engage with users, these chatbots need to collect data about users' differing expectations, preferences, and cultural backgrounds (Balaji, 2019; Dev, 2022; Dev & Dev, 2023; Griffin et al., 2021; Hill et al., 2015; van Eeuwen, 2017). This personal data collection by chatbots raises significant privacy and security concerns (Adam et al., 2020; Chong et al., 2021; Waheed et al., 2022; Thomaz et al., 2020). Addressing these concerns within the context of social informatics (SI) not only safeguards individual rights but also lays the foundation for a more resilient and trustworthy technology ecosystem. Accordingly, this paper addresses the following research questions:
- What theoretical frameworks and methodologies have been proposed or utilized in the literature to understand the impact of conversational text-based AI chatbots on user privacy?
- What user privacy harms and risks arise from interactions with text-based conversational AI chatbots?
- What are the gaps in understanding and future research directions regarding mitigating the impact of conversational text-based AI chatbots on user privacy?
To address these questions, this paper includes an extensive literature review focusing on text-based interactions of conversational AI chatbots published between 2017 and August 2023. This study's scope encompasses a selective review of literature, an analysis of theoretical frameworks and methodologies, and an exploration of privacy concerns arising from user interactions with conversational text-based AI chatbots.
2 BACKGROUND
2.1 History and evolution of conversational AI chatbots
In 1950, the original idea of a chatbot emerged from Alan Turing's proposal of a test of "a machine's ability to exhibit intelligent behavior equivalent to, or indistinguishable from, that of a human," later known as the Turing Test (Turing, 1950). In 1966, the first chatbot, ELIZA, which simulated a psychotherapist and would respond to users' inputs in question form, was developed (Weizenbaum, 1966). The conversational ability of this chatbot was based on limited knowledge, and it often failed to answer both simple and complex questions, motivating users to seek out better chatbot models (Klopfenstein et al., 2017). After ELIZA, more advanced chatbots were developed and explored, with some still in use today. These include the PARRY chatbot (1972) (Colby et al., 1971), Racter (1983), Jabberwacky (1988), Cleverbot (1995), the A.L.I.C.E. (Artificial Linguistic Internet Computer Entity) chatbot (1995), and SmarterChild (2001) (Molnár & Szüts, 2018). These chatbots were mostly text-based and operated in a question and answer (Q&A) format wherein users provided inputs in the form of questions and the chatbot responded with appropriate answers or information. This Q&A format allowed users to engage in conversations or seek information from these chatbots.
In 1972, PARRY represented an advance because it could express emotional responses during conversations (Colby et al., 1972; Heiser et al., 1979). In 1988, AI was first used in the Jabberwacky chatbot with the introduction of contextual patterns that utilized previous human–chatbot conversational chat logs (Jabberwacky, 2019). In 1991, the term "Chatterbot" was introduced (Adamopoulou & Moussiades, 2020). Chatterbot, later abbreviated to "chatbot," was an artificial player created by Michael Mauldin within TINYMUD, a multiplayer, real-time virtual world designed primarily for engaging in conversations (Mauldin, 1994). During this time, voice-based chatbots were explored, such as Dr. Sbaitso (Sound Blaster Artificial Intelligent Text to Speech Operator) (1992) (Dr. Sbaitso, 2019). This voice-based chatbot had a unique purpose: it was created to showcase the capabilities of sound cards by generating digitized voices, demonstrating the potential of these audio devices (Dr. Sbaitso, 2019). Its relatively uncomplicated interactions suited its role of simulating a psychologist (Zemäcik, 2019). The first online chatbot, A.L.I.C.E., relied on a simple pattern-matching algorithm based on AI Markup Language (AIML) (Marietto et al., 2013). It did not have the capability to fully understand a conversation (Marietto et al., 2013). Even so, in 2000, 2001, and 2004, it won the Loebner Prize for the best human-like computer program (Bradeško & Mladenić, 2012; Wallace, 2009). A.L.I.C.E.'s knowledge base was composed of an extensive collection of approximately 41,000 templates and associated patterns, compared to ELIZA's merely 200 keywords and rules (Heller et al., 2005). However, A.L.I.C.E. lacked intelligent attributes, resulting in its inability to produce emotionally charged or attitudinal responses reminiscent of human-like interactions (Adamopoulou & Moussiades, 2020). In 2001, starting with SmarterChild, chatbots were integrated into messenger applications such as America Online (AOL) and Microsoft's MSN (Molnár & Szüts, 2018). At this time, chatbots began interacting with humans in attempts to help with their daily tasks. Over the years, chatbots became popular tools for providing various industrial solutions; for example, customer service chatbots were implemented to support or replace human labor (Sheehan et al., 2020; Wang et al., 2013).
In 2016, chatbots were embedded in the Internet of Things (IoT) (Kar & Haldar, 2016) through virtual personal assistants such as Apple Siri (2010), IBM Watson (2011), Google Now (2012), Microsoft Cortana (2014), Google Assistant (2016), and Amazon Alexa (2019). These tools offered both text-based and voice-based conversations and generated fast and meaningful responses. Their designs were modern and incorporated new functionalities, including voice recognition and synthesis, customized interaction, third-party integration, context awareness, and multi-turn capability (Belda-Medina & Calvo-Ferrer, 2022; Caldarini et al., 2022). Nevertheless, some of these tools encountered challenges when it came to accurately interpreting and responding to spoken language, especially in specific linguistic contexts (Adamopoulou & Moussiades, 2020; Soffar, 2019).
In late 2022, ChatGPT (Chat Generative Pre-trained Transformer) emerged as a prominent chatbot, representing a large language model rooted in Reinforcement Learning from Human Feedback and designed for generating conversational outputs (Dwivedi et al., 2023). It uses deep-learning algorithms and training data to understand and generate human-like text (Dwivedi et al., 2023; Lund & Wang, 2023). Although there are different models of ChatGPT, GPT-4, which currently requires a subscription service, is one of the most advanced iterations, known for its large scale and impressive language capabilities. These technological advancements are significant for both the scientific realm and society at large, as their influence extends across various industries and disciplines (van Dis et al., 2023).
2.2 Types of conversational AI chatbots
Conversational AI chatbots have evolved into more sophisticated entities, incorporating advances in NLP, neural networks, and deep learning technologies. They can engage in text-based and/or speech-enabled interactions, accompanied by visuals, virtual gestures, or even haptic-assisted physical gestures. Some have even been trained to perform specific tasks, earning them the designation of intelligent personal assistants (IPAs) (Belda-Medina & Calvo-Ferrer, 2022). Adamopoulou and Moussiades (2020) and Belda-Medina and Calvo-Ferrer (2022) have each presented comprehensive classifications of chatbots. However, in the interest of clarity and comprehensiveness, elements from both approaches have been merged to create a unified classification of conversational AI chatbots, as shown in Table 1 and described below.
Knowledge | Service | Goals | Response generation method | Human intervention and autonomy | Development platform permissions | Communication channels
---|---|---|---|---|---|---
Open-domain | Servant | Informative | Rule-based | Human-mediated | Open-source | Text-based
Closed-domain | Personal assistant | Conversational | Retrieval-based | Autonomous | Commercial | Voice-based
 | Inter-agent | Task-based | Generative | | | Image-based
 | | | | | | Combinations of text, voice, and image
- Open-domain: These versatile agents possess the ability to address user queries across multiple domains or subjects, enabling them to provide answers and engage in conversations on a wide range of topics. Chorus, a crowd-powered conversational assistant, exemplifies this category. Other examples include Guardian, CRQA, and AskWiz (Adamopoulou & Moussiades, 2020).
- Closed-domain: These chatbots are designed to respond exclusively to queries within a specific and predefined knowledge domain, limiting their scope to a particular area of expertise or subject matter.
- Servant: These chatbots primarily offer services, such as handling bookings and conducting FAQ searches, without assuming a companion role. Often characterized as interpersonal chatbots rather than companions, they focus on efficiently managing user requests and interactions. Their primary objective is to facilitate user interactions and deliver practical services, akin to the previously mentioned Q&A-oriented chatbots.
- Personal assistant: These chatbots closely accompany users, residing within their domain, and are often linked to messenger apps.
- Inter-agent: These chatbots facilitate communication and interaction between different chatbot entities, allowing them to exchange information, collaborate, and jointly fulfill tasks or provide services (e.g., Amazon's Alexa and Microsoft's Cortana).
- Informative: These chatbots specialize in retrieving specific information from predetermined sources, offering users accurate and concise responses to queries by drawing on their stored knowledge base (e.g., Guardian).
- Conversational: These chatbots engage in natural dialogue, emulating human interactions (e.g., Cleverbot, Blenderbot). These chatbots aim to provide a conversational experience that feels seamless and human-like, allowing users to communicate with the bot in a way that is similar to how they would interact with another person (Khanna et al., 2015; Shawar & Atwell, 2007).
- Task-based: These chatbots are adept at performing specific functions or tasks, effectively assisting users with practical needs. For example, “RoomBot” efficiently handles room reservations and inquiries for hotels.
- Rule-based: These chatbots operate based on predefined sets of rules and instructions, responding to user inputs according to a predetermined decision tree or logic structure.
- Retrieval-based: These chatbots generate responses by selecting pre-existing responses from a collection of predefined options, typically using algorithms to match user inputs with the closest available responses (see the illustrative sketch following this list).
- Generative: Generative chatbots create responses using natural language generation techniques, generating unique and contextually relevant answers rather than selecting from predefined options. ChatGPT-3.5 and GPT-4, which can generate human-like text based on the input they receive, are examples of generative-based chatbots.
- Human-mediated: These chatbots integrate human computation to confer heightened flexibility.
- Autonomous: These chatbots operate without continuous human intervention and are capable of handling tasks and interactions independently. Nevertheless, they may still have limitations or vulnerabilities that require occasional human interaction or intervention for improvement.
- Open-source: These chatbots follow an ethos wherein their code is freely available for modification and contributions by the community. While they can also be used by the public, the key distinction lies in their open and collaborative nature, allowing users to customize and improve upon the chatbot's functionality.
- Commercial: These chatbots are often designed for use by the public at large, serving a broad user base and typically motivated by profit-oriented goals. These chatbots may generate revenue through various means and are accessible to both customers and clients, thus aligning them with the idea of being “public.”
- Text-based: These chatbots are computer programs designed to engage in conversations with users through written or typed text. They operate by processing and generating text responses and are typically used in messaging apps or on websites to provide information or assistance or perform specific tasks.
- Voice-based: Voice-based chatbots interact with users through spoken language. These chatbots use automatic speech recognition (ASR) and NLP to understand and respond to voice commands and inquiries. They are often found in devices like smartphones, smart speakers, and automotive systems.
- Image-based: These chatbots are artificial intelligence systems designed to interact with users through visual content, such as images or pictures, instead of text or voice. These chatbots employ image recognition technology to understand and respond to user queries or commands based on the content of images provided by the user.
- Combinations of text-based, voice-based, or image-based: These combined-function chatbots can interact with users through a variety of communication modalities. They are designed to seamlessly switch between text, voice, and image recognition capabilities, providing users with a versatile and multimodal conversational experience based on their preferences and the context of the interaction.
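To make the rule-based and retrieval-based categories above more concrete, the following minimal Python sketch illustrates one way each approach might select a reply. The trigger phrases, canned responses, and similarity measure are invented for this example and are not drawn from any of the reviewed systems.

```python
# Illustrative contrast between rule-based and retrieval-based response
# selection. All rules and canned responses below are hypothetical.
from difflib import SequenceMatcher

RULES = {  # rule-based: predefined trigger phrase -> fixed reply
    "opening hours": "We are open 9am-5pm, Monday to Friday.",
    "book a room": "Sure -- which dates would you like to book?",
}

CANNED_RESPONSES = [  # retrieval-based: pool of pre-existing replies
    "You can reset your password from the account settings page.",
    "Shipping usually takes three to five business days.",
    "Our support team is available around the clock.",
]


def rule_based_reply(user_text: str) -> str:
    """Return the reply whose trigger phrase appears in the user input."""
    lowered = user_text.lower()
    for trigger, reply in RULES.items():
        if trigger in lowered:
            return reply
    return "Sorry, I don't have an answer for that."


def retrieval_based_reply(user_text: str) -> str:
    """Return the canned response most similar to the user input."""
    return max(
        CANNED_RESPONSES,
        key=lambda resp: SequenceMatcher(None, user_text.lower(), resp.lower()).ratio(),
    )


if __name__ == "__main__":
    print(rule_based_reply("What are your opening hours?"))       # rule match
    print(retrieval_based_reply("How long does shipping take?"))  # closest match
```

Generative chatbots, by contrast, produce novel text rather than selecting from a fixed pool, which is why they form a separate category above.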
2.3 User privacy, chatbots, and social informatics
Over the course of conversational chatbot evolution, chatbots have been designed to collect significant amounts of personal information, often without transparency or user consent. This practice has given rise to user privacy and security harms and risks (Dev, 2022; Ischen et al., 2020; Kelly et al., 2022; van Eeuwen, 2017; Hendrickx et al., 2021; Chaves & Gerosa, 2021). These harms and risks can manifest both before a conversation with a chatbot is even initiated, encompassing concerns such as cognitive biases and dark patterns (Alberts & Kleek, 2023), and during the interaction itself (Ischen et al., 2020). The information gathered during these interactions could be susceptible to unauthorized access, data breaches, and misuse, potentially leading to adverse consequences for users. Examining these concerns through an SI lens both safeguards individual rights and lays the foundation for a more resilient and trustworthy technology ecosystem.
SI is an interdisciplinary research field that focuses on (1) the relationships between information and communication technologies (ICTs), (2) the design, implementation, and use of ICTs, and (3) the context in which ICT design and use occur (Fichman et al., 2015; Kling et al., 1998; Sawyer & Eschenfelder, 2002). SI emphasizes that "ICTs do not exist in social or technological isolation" (Kling et al., 2000, p. 15). SI research is problem-oriented, focusing on problems rather than on the theories or methods used in a research study (Sawyer & Eschenfelder, 2002; Sawyer & Rosenbaum, 2000). It is an empirically focused area that interprets findings in terms of the social impacts of ICTs (Kling, 2000; Kling et al., 2000; Sawyer & Eschenfelder, 2002; Sawyer & Rosenbaum, 2000). Moreover, SI addresses issues arising from the bidirectional relationship between social factors and ICTs, often connecting specific social levels of analysis to the broader context in which computing occurs (Sawyer & Eschenfelder, 2002). Much like related areas such as human–computer interaction (HCI) and computer-mediated communication (CMC), SI seeks to comprehend the dynamic relationship between society and technology (Herring, 2014; Sawyer & Eschenfelder, 2002). By integrating these diverse perspectives, SI endeavors to offer a comprehensive understanding of how technology influences and is influenced by human behavior. Manole et al. (2016) acknowledged the inclusion of topics such as "privacy and social control through ICT" and "the effects on privacy and individual development" within the realm of SI. Privacy is a pertinent topic in our digital age, as individuals and societies grapple with the implications of information sharing in the digital realm.
Core SI findings about ICTs include the following:
- The context of ICT use directly affects ICTs' meanings and roles.
- ICTs are not value neutral; their use creates winners and losers.
- ICT use leads to multiple, and often paradoxical, effects.
- ICT use has moral and ethical aspects, and these have social consequences.
- ICTs are configurable—they are actually collections of distinct components.
- ICTs follow trajectories and these trajectories often favor the status quo.
- ICTs co-evolve during design/development/use (before and after implementation).
Applying an SI lens to chatbot privacy offers several insights:
- SI sheds light on user behavior and practices, such as the sharing of personal information with chatbots, which is a key concept in exploring privacy harms and risks.
- SI considers social norms and values, highlighting how varying expectations regarding privacy shape user–chatbot interactions.
- SI assesses how chatbots collect, store, and use user data, and whether these practices align with social norms and expectations.
- The SI perspective emphasizes the influence of power dynamics, acknowledging that the creators and operators of chatbots can affect user privacy.
- SI recognizes the feedback loop, where user experiences and concerns can influence the development and policies governing chatbot technology.
- SI underscores the importance of legal and ethical frameworks in addressing privacy concerns and determining whether existing regulations and guidelines are adequate.
Kling et al. (2000, 2007) also argued that early examination of new and emerging applications during the diffusion process can impact and shape the development and utilization of novel technologies. Moreover, several privacy studies have already applied an SI lens. For instance, Fusco et al. (2010) used an SI framework when addressing privacy and trust concerns in online social networking. Gamble (2020) also applied an SI perspective when exploring the use of, and concerns about, AI and mental healthcare chatbots. Thus, exploration of user privacy harms and risks in conversational text-based AI chatbots, and broader investigation of user interactions and decision-making, are important to SI.
3 METHODOLOGY
The review followed a four-step procedure:
- Define: Selecting the inclusion/exclusion criteria and identifying the appropriate data sources.
- Search: Collecting the papers by searching the identified sources.
- Select: Defining the final sample by checking the papers against the identified inclusion/exclusion criteria.
- Analyze: Analyzing the selected papers through open, axial, and selective coding techniques.
In this study, this procedure offers a rigorous approach to identifying and analyzing pertinent themes within the domain of privacy and conversational AI-based chatbots. By adopting a methodological process encompassing data collection, selection, and analysis, this method makes the literature review more comprehensive and complete.
3.1 Data collection
3.1.1 Inclusion and exclusion
The following inclusion criteria were applied:
- The articles must study interaction with conversational AI chatbots that are exclusively text-based and bidirectional.
- The articles must present at least one study with users in which the technology is studied with reference to the people who interacted with it. This criterion was defined to ensure that the articles did not exclusively focus on system capabilities or performance.
- The articles must focus on user privacy harms and risks that arise from interactions with text-based conversational AI chatbots. The papers should also present findings on how individuals experience such interaction, for example, how users behave, feel, think, and react during the interaction, or how they perceive or evaluate the chatbot and their interaction with it. This criterion ensures the multidisciplinary scope of the reviewed work.
- The articles must be published in peer-reviewed journals and conferences between 2017 and August 2023 and must be written in English. Relevant dissertations and master's theses were also included.
The following exclusion criteria were applied:
- The articles that assessed only the effectiveness of chatbot design for privacy (e.g., Berger et al., 2019; Figueroa et al., 2021; Lin et al., 2019) or privacy compliance (e.g., Brüggemeier & Lalone, 2022; Calvaresi et al., 2021; Salehzadeh Niksirat et al., 2023; Xie et al., 2022), with or without exploring user interactions with the chatbot, should be excluded.
- The articles that conducted user studies only to assess the effectiveness of an NLP technology or algorithm for privacy (e.g., Shah & Panchal, 2022; Srivastava & Prabhakar, 2019) should be excluded.
- The articles that focused on security rather than on users' privacy (e.g., Edu et al., 2022; Eltahir et al., 2022; Ye & Li, 2020) should be excluded. If the papers explored security attacks and/or vulnerabilities to the system, they should also be excluded.
- The articles that explored voice assistants (e.g., Alexa, Siri, etc.), IoT articles dealing with ECAs, speech technology, or bots that were not able to sustain meaningful conversations (e.g., Liao et al., 2019; Liesenfeld et al., 2021; Lin et al., 2022; Saffarizadeh et al., 2017; Shenava et al., 2023; Stucke & Ezrachi, 2017) should be excluded.
- The articles that were published in the adjunct proceedings of conferences (such as posters, workshop papers, late-breaking result papers) or in books should be excluded.
- The articles that are duplicates should be excluded.
3.1.2 Search strategy
The papers presented in this review were manually collected in June 2023 from Google Scholar, the Association for Computing Machinery Digital Library (ACM DL), Web of Science, and the Institute of Electrical and Electronics Engineers (IEEE) Xplore digital library. These repositories were selected to capture, as far as possible, any article addressing relevant topics in conversational AI chatbots and privacy. The search was conducted using a number of search statements that combined relevant phrases and terms. The first statement aimed to explore general privacy concerns related to chatbots, while the second statement specifically targeted user privacy concerns in chatbot interactions. The search statements were:
Statement 1: (chatbot* OR “AI chatbot” OR “virtual assistant*” OR “conversational agent*”) AND (privacy*).1
Statement 2: (chatbot* OR "AI chatbot" OR "conversational AI" OR "conversational AI chatbot" OR "virtual assistant*" OR "conversational agent*") AND (privacy* OR "usable privacy" OR "privacy issues").2
The results of both rounds of search and papers that were downloaded and included in the subsequent analysis are provided in Table 2.
Sources | Statement 1: Search results | Statement 1: Selected for review | Statement 1: Papers downloaded | Statement 2: Search results | Statement 2: Selected for review | Statement 2: Papers downloaded
---|---|---|---|---|---|---
Google Scholar | 24,600 | 200 | 57 | 145,000 | 200 | 32
ACM Digital Library | 22,474 | 200 | 26 | 25,265 | 200 | 11
Web of Science | 34 | 34 | 19 | 41 | 41 | 12
IEEE | 11 | 11 | 4 | 8 | 8 | 2
Total | 47,119 | 445 | 106 | 170,314 | 449 | 57
3.1.3 Article selection
The search was limited to titles, abstracts, and keywords across the four sources, with journals and conference proceedings as source types and articles and conference papers as document types. The author assessed each search result individually by opening and examining it within each source's results. The results were then exported to a spreadsheet with their corresponding URLs and rearranged into uniform columns of publication details; duplicates were deleted, and papers that clearly did not meet the formal criteria (e.g., workshop papers) were removed. Upon careful examination of the search results across the four sources, it was observed that materials beyond the 198th paper either duplicated content already assessed or were unrelated to the review's scope. As a result, when a search generated more than 200 results, the additional results were disregarded on the assumption of duplication or lack of relevance; when the result count was below 200, all search results were reviewed comprehensively. In total, the titles and abstracts of 894 papers were reviewed.
After additional screening for relevance based on their abstracts and removal of duplicates, the dataset consisted of 163 papers (Statement 1 = 106 and Statement 2 = 57). The findings and recommendations sections of these remaining papers were examined to identify whether they addressed user privacy harms and risks within the context of conversational text-based chatbot interactions; papers that did not address these aspects were removed from consideration. In total, 61 full-text papers were deemed eligible for further analysis. If ambiguities persisted about the eligibility of a specific paper, its full text was analyzed. After this analysis, a total of 23 papers were excluded: 11 focused primarily on security or design for privacy compliance, and 12 did not include at least one user study that explored user privacy impacts from chatbot interactions. At this stage, 38 papers were identified for inclusion in the corpus. The qualification rates were Google Scholar (47%), ACM DL (22%), Web of Science (14%), and IEEE Xplore (20%). Figure 1 shows the flow diagram of the database searches and article screenings.
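Although the screening described above was performed manually, the deduplication and relevance-filtering steps can also be expressed programmatically. The sketch below is purely illustrative: the file name, column names, and keyword filters are assumptions for this example, not part of the review's actual workflow.

```python
# Hypothetical sketch of the export-dedupe-screen steps described above.
import pandas as pd

results = pd.read_csv("search_results.csv")  # exported results (hypothetical file)

# Remove duplicate records, matching on title and URL.
results = results.drop_duplicates(subset=["title", "url"])

# Drop records that clearly fail the formal criteria (e.g., workshop papers).
results = results[~results["venue"].str.contains("workshop", case=False, na=False)]

# Keep records whose abstracts mention both chatbots and privacy
# for full-text screening.
relevant = results[
    results["abstract"].str.contains("chatbot|conversational agent", case=False, na=False)
    & results["abstract"].str.contains("privacy", case=False, na=False)
]
relevant.to_csv("screened_for_full_text_review.csv", index=False)
```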
[Figure 1. Flow diagram of the database searches and article screenings.]
3.2 Data analysis
The papers were read closely, and statements were extracted based on a set of categories developed in advance. The extracted and coded statements included how each paper defined AI chatbots, the types of AI chatbots studied, the methods and theoretical foundations, the research subjects/participants, the privacy findings or recommendations, the limitations, the future research, and the conclusions. A small sample of papers (10%) was coded separately, and the results were then reviewed; differences were reconciled, and the coding procedure continued. Any misleading entries were removed from the codebook. The final codebook is shown in Table 3.
Code | Description |
---|---|
Publication date | Paper's publication year |
Publication venue | Paper's publication venue name |
Paper | Paper code should be as Author Last Name, First Name Initial. et al. (year) |
AI chatbot is | Describe how AI chatbot is defined or conceptualized in the paper using a brief 3–5-word description |
Paper goal | Describe the main goal of the paper |
Paper methods | Specify methods that were used in this paper (e.g., conceptual analysis, case study, survey, review) |
Study participants | If the paper is empirical (defined as evidential in paper modality), describe study participants (number and major characteristics, could be non-human, e.g., documents), if not—see “actors” in the projects tab |
Findings or recommendations | Describe the recommendations proposed in the paper or summaries of what researchers (or other actors) do about AI chatbot and privacy |
Conclusion | Summarize the main conclusion of the paper |
Limitations | Describe the limitations proposed in the paper |
Future research | Describe future research proposed in the paper |
Comments | Additional thoughts or questions |
Quotes | Insert quotes that are important for the review |
4 RESULTS
In this section, the results that emerged from the analysis of the 38 papers included in the corpus are reported. The main theoretical and methodological approaches used are described.
4.1 Theoretical approaches
Research on privacy and conversational text-based chatbots has employed a diverse range of theories from disciplines such as social science, information technology management, and cognitive science. The theories are divided into subsections that detail the social science, technology, and cognitive science theories applied in privacy-oriented conversational text-based AI chatbot studies. Table 4 provides an overview of the theories that have been used in this research.
Theory category | Theory | Paper
---|---|---
Social Science | Agency Theory | Cheng and Jiang (2020)
 | Human Action Regulation Theory | Gieselmann and Sassenberg (2022)
 | Brand Relationship Theory | Ischen et al. (2020)
 | Trust-Commitment Theory | Rajaobelina et al. (2021)
 | Conversation Theory | Dev (2022)
 | Communication Privacy Management Theory | Sannon et al. (2020)
 | Humanness-Value-Loyalty Model (HVL Model) | Pizzi et al. (2023)
 | Information Boundary Theory | Fan et al. (2022)
 | Innovation Diffusion Theory | van Eeuwen (2017); Mercieca (2019)
 | Need to Belong (NTB) | Widener and Lim (2020)
 | Privacy Management Theory | Sannon et al. (2020)
 | Privacy/Technology Paradox | Fan et al. (2022); Gieselmann and Sassenberg (2022); Rajaobelina et al. (2021); Ischen et al. (2020)
 | Regulatory Focus Theory | Kim et al. (2022)
 | Social Penetration Theory | Lee et al. (2020); Widener and Lim (2020)
 | Social Presence Theory | Benke et al. (2022); Cheng and Jiang (2020); Lim and Shim (2022); Ng et al. (2020); Sohn et al. (2019); Widener and Lim (2020)
 | Social Response Theory | Ischen et al. (2020); Ng et al. (2020); Prakash et al. (2023); Song et al. (2022)
 | Uses and Gratifications (U&G) Theory | Cheng and Jiang (2020); Mercieca (2019); Rese et al. (2020); Marjerison et al. (2022)
Technology | Computers as Social Actors (CASA) Theory | Lee et al. (2020); Lee et al. (2022); Lim and Shim (2022); Ng et al. (2020); Sannon et al. (2020); Song et al. (2022); Widener and Lim (2020); Agnihotri and Bhattacharya (2023)
 | Privacy Calculus | Kim et al. (2022); Lim and Shim (2022)
 | Technology Acceptance Model (TAM) | van Eeuwen (2017); Kelly et al. (2022); Lim and Shim (2022); Mercieca (2019); Prakash et al. (2023); Rese et al. (2020); Rodríguez Cardona et al. (2021)
Cognitive Science | Theory of Mind | Pizzi et al. (2023)
4.1.1 Social science theories
Notably, social science theories have played a significant role in explaining the intricate dynamics between users, technology, and privacy outcomes. In this review, 17 distinct social science theories were identified. This section delves into this nuanced landscape of social science theories, offering insight into their application in studies concerning conversational text-based AI chatbots.
Social Presence Theory (SPT)
Ng et al. (2020) considered social presence to be the perception and experience of interacting with a human being: "In this context, the experience of social presence can be assumed to be more important for the self-disclosure of sensitive and emotional information, such as feelings that are more intimate, than for the disclosure of factual information such as names (Taddicken, 2014)" (p. 2).
Social Response Theory (SRT)
SRT suggests that people react to technology that seems human-like in a similar way as they do to other humans (Nass & Moon, 2000). Reeves and Nass (2000) argued that a social response to a computer system, whether automatic or natural, is not different from responses in human interactions. Ischen et al. (2020) utilized SRT as the foundation for their research on privacy concerns in chatbot–user interactions. Their primary aim was to explore the extent to which these privacy concerns were associated with users' attitudes and their adherence to recommendations. Drawing from SRT, Ischen et al. (2020) built on this framework to examine anthropomorphism and privacy concerns as sequential underlying mechanisms that could explain the outcomes observed in the study. Ng et al. (2020) also aligned with SRT in suggesting that individuals use social norms as a guide when interacting with computers. Drawing upon SRT, Song et al. (2022) focused on perceived privacy risk during interactions with human and computer service agents by examining perceived differences in communication quality and privacy risks and by investigating the potential influence of users' needs or desires for human interaction, such as personalized information (Ashfaq et al., 2020), on these perceptions. Prakash et al. (2023) applied SRT to identify the factors influencing consumers' trust in text-based customer service chatbots and examined the impact of trust on users' behavioral intentions, which may affect users' willingness to disclose personal information to these chatbots. They also used SRT to explore how the sense of social presence (Noor et al., 2022; Qiu & Benbasat, 2009; Vitezić & Perić, 2021) affects trust in customer service chatbots. SRT helped Prakash et al. (2023) show that individuals have a natural inclination to perceive computers as social actors, even when aware of the machine's lack of feelings or motives (Nass & Moon, 2000).
Social Penetration Theory (SPT2)
Altman and Taylor (1973) define SPT2 as the bonding process through which a superficial relationship develops into a more engaged and intimate one. SPT2 depends on self-disclosure, which is a key factor in engagement. Altman and Taylor (1973) categorized self-disclosure into four stages: orientation, exploratory, affect-exchange, and stable-exchange. SPT2 also involves two dimensions for measuring self-disclosure: disclosure breadth, reflecting the extent of self-relevant statements made in an interaction, and disclosure depth, indicating the level of intimacy in the information shared (Altman & Taylor, 1973). Widener and Lim (2020) explored the factors that prompt individuals to engage in self-disclosure during interactions with AI chatbots and the outcomes of such interactions. They considered self-disclosure a significant element of SPT2, which underpins their exploration of communication dynamics and the depth of personal disclosure in the context of chatbot interactions. Lee et al. (2020) applied SPT2 to investigate the potential of chatbots in facilitating deep self-disclosure. They argued that SPT2 is a valuable framework for conceptualizing the range of self-disclosure to others and the subsequent bonding that may occur between humans and chatbots.
Uses and Gratifications (U&G) Theory
U&G theory is a mass communication paradigm used to understand why people use media and the gratifications they obtain through proactive media consumption (Katz et al., 1973). Cheng and Jiang (2020) expanded the application of U&G and categorized the obtained gratifications into four main clusters based on users' experiences: hedonic, technological, social, and utilitarian gratifications. Cheng and Jiang (2020) implemented the theory in the domain of customer service by creating a theoretical framework that helps to explain the effects of smart media and the overall influence of AI chatbots on business outcomes. U&G theory also helped them examine perceived privacy risk with chatbots and its impact on user satisfaction (Cheng & Jiang, 2020); they combined SPT with U&G theory to understand user satisfaction and perceived privacy risks of chatbots. In Marjerison et al.'s (2022) study on ecommerce chatbots, U&G theory was also adopted to analyze users' acceptance of these chatbots by exploring technology acceptance (chatbot design and interaction) (Wang et al., 2020), hedonism, and privacy risks. Mercieca (2019) conducted research exploring the factors influencing acceptance, trust, and confidence in chatbots and applied U&G theory in the study. Correlating their findings with U&G theory, Mercieca (2019) suggested that the benefits (gratifications) of incorporating AI chatbots may be a contributing factor to their acceptance and popularity across various domains (Brandtzaeg & Følstad, 2017). In other words, to ensure the success of chatbot-based services, it is essential to prioritize usefulness and ensure that chatbots fulfill specific needs and gratify users with valuable functionalities (Brandtzaeg & Følstad, 2017; Mercieca, 2019). Rese et al. (2020) applied U&G theory while identifying factors that positively or negatively influenced the usage intention and frequency of use of an AI chatbot called "Emma" (Davis et al., 1989, p. 983). U&G theory allowed them to categorize motivations for using text-based chatbots, building on previous work by Brandtzaeg and Følstad (2017). Considering potential hindrances to chatbot acceptance, Rese et al. (2020) addressed two risks: the immaturity of the technology and privacy concerns. They also argued that U&G theory can provide more information when evaluated alongside their U&G model of the chatbot "Emma" (Figure 2).
[Figure 2. U&G model of the chatbot "Emma" (Rese et al., 2020).]
Innovation Diffusion Theory
Both van Eeuwen (2017) and Mercieca (2019) applied Innovation Diffusion Theory (Rogers, 2003), which explains how and at what rate technologies are diffused and adopted by users. They adopted the theory to describe the factors influencing users' trust in accepting and adopting chatbots.
Conversation Theory
This theory explains that social systems are formed when individuals share a common interpretation of language (Pask, 1976). In the context of chatbots, this means that a social system can be established if the chatbot effectively engages humans in conversation, fostering shared interpretation and understanding. Dev (2022) utilized this theory to examine chatbot interactions and their influence on social systems, emphasizing the importance of integrating interpersonal communication elements into computer systems, especially the chatbots' capacity to facilitate meaningful interactions with humans.
Agency Theory
Loyalty incentives are suggested to have a substantial impact on the continued intention of using electronic commerce services (Bhattacherjee, 2001). Building on this theory, Cheng and Jiang (2020) explored whether customer loyalty toward a brand would positively predict their intentions to continue using its chatbot services.
Need-to-Belong (NTB) Theory
NTB refers to the inherent and pervasive motivation in human beings to form and maintain lasting interpersonal connections (Baumeister & Leary, 1995). The level of NTB in users was believed to influence the experience and psychological consequences arising from individuals' interactions with AI. Widener and Lim (2020) discussed how NTB Theory influenced individuals' interactions with an AI agent, shedding light on how this fundamental desire for interpersonal attachments played a role in their engagement with the AI.
Human Action Regulation Theory
This theory holds that action regulation in human work is organized into four levels: sensorimotor, flexible action patterns, intellectual, and meta-cognitive heuristics (Zacher & Frese, 2018). At the sensorimotor level, regulation occurs without actions being associated with independent goals. At the level of flexible action patterns, regulation draws on how people process information from their environment. The intellectual level requires focus because it involves new and complex actions for which decisions must be made. According to Gieselmann and Sassenberg (2022), these levels of regulation directly affect self-disclosure toward conversational AI. Their study proposed a framework applying this theory to refine the articulation of conversational AI proficiencies so that they align with the facets of human capabilities that affect human interaction with chatbots (Gieselmann & Sassenberg, 2022).
Brand Relationship Theory
This theory is a conceptual framework that explores and analyzes the interactions and connections between consumers and brands beyond mere transactional exchanges (Fournier, 1998). It posits that consumers can develop and maintain relationships with brands similar to the way they build interpersonal relationships with other individuals (Ischen et al., 2020). The theory emphasizes the idea that consumers can form deep and meaningful connections with brands, and these relationships can significantly influence their purchasing decisions, brand loyalty, and overall brand perception. Ischen et al. (2020) employed this theory in their study and argued that such brand relationships directly affect user self-disclosure when interacting with chatbots.
Humanness-Value-Loyalty Theoretical Framework (HVL Model)
The HVL Model is grounded in social categorization theory (Fiske et al., 2007). It serves as a conceptual framework for understanding how users categorize chatbots in terms of their perceived humanness and how these categorizations influence their future intentions and behaviors (Pizzi et al., 2023). It acknowledges that the degree of humanness attributed to chatbots, expressed through attributes such as anthropomorphism and gaze direction (Bailenson, Beall, & Blascovich, 2002), can impact users' assessments of the value they derive from the service and their loyalty to the service provider offering a chatbot-mediated interface (Pizzi et al., 2023). In Pizzi et al. (2023), the model was employed to investigate the influence of customers' perceptions of chatbots' humanness on their future intentions. The study examined how anthropomorphism (the degree to which a chatbot resembles a human) and gaze direction (the direction in which the chatbot "looks") affect users' willingness to disclose personal information to a chatbot. These findings align with the core concepts of value and loyalty within the original model (Chai et al., 2015; Cloarec et al., 2022; Malhotra et al., 2004).
Regulatory Focus (RF) Theory
Kim et al. (2022) incorporated RF theory to examine the relationship between consumers' motivations and their actions in the context of chatbot advertising. Kim et al. (2022) introduced RF as a crucial concept to reconcile inconsistent findings on the disclosure of personal information (see Figure 3). They defined this theory as the alignment of individuals with either prevention- or promotion-oriented self-regulatory orientations (Higgins, 1997) in various daily activities such as driving, investing, and shopping (Kim et al., 2022). In other words, according to the theory, individuals can be driven by promotion-oriented goals, which involve seeking advancement and success to achieve an ideal self and prioritizing potential gains over potential risks or losses (Kim et al., 2022). Kim et al. (2022) found that consumers with a predominantly promotion-focused mindset were more receptive to personalized chatbot advertising, valuing the benefits offered and displaying a higher likelihood of making purchases.
[Figure 3. Regulatory focus and the disclosure of personal information in chatbot advertising (Kim et al., 2022).]
Communication Privacy Management (CPM) Theory
CPM theory posits that individuals have well-defined expectations about privacy boundaries and rules for disclosing information to third parties during interactions with other people (Petronio, 2013). The theory emphasizes that individuals see themselves as owners of their private information and employ a personalized set of privacy rules to control access to that information. When individuals share private information with interaction partners, these partners become co-owners of the information and are expected to abide by mutually agreed-upon or assumed privacy rules (Sannon et al., 2020). If these rules are breached, such as when a co-owner shares the information with an unauthorized third party, the original owner perceives it as a privacy violation (Sannon et al., 2020). Sannon et al. (2020) extended CPM theory to the context of human–agent interactions, aiming to understand people's expectations and mental models concerning how agents handle data sharing practices and potential privacy violations.
Trust-Commitment Theory
Rajaobelina et al. (2021) considered that trust plays a significant role in human–chatbot interaction. They applied trust-commitment theory, which posits a strong relationship between trust and loyalty (Morgan & Hunt, 1994): loyalty reflects high commitment, and trust is an antecedent of commitment (Morgan & Hunt, 1994).
Information Boundary Theory
This theory suggests that each customer possesses an informational space with distinct boundaries that they strive to manage and control (Fan et al., 2022). Any attempt to breach the boundary between general and personal information (e.g., a marketer seeking to collect a consumer profile) may lead the customer to feel uneasy and dissatisfied (Fan et al., 2022). In this theory, the dilemma arises from the customer's desire to receive personalized service while safeguarding their personal information, creating a paradoxical situation (Fan et al., 2022; Karwatzki et al., 2017). Fan et al. (2022) utilized Information Boundary Theory to examine the potential role of chatbot service in addressing customers' personalization–privacy paradoxes.
Privacy/Technology Paradox
The paradox can be defined as the complex relationship between individuals' concerns about privacy and their willingness to share personal information in the context of technological interactions. It reflects the apparent contradiction where people express privacy concerns but still engage in behaviors that involve sharing personal data due to perceived benefits or convenience. Several studies have employed this theory to assess the impact of personalization on privacy concerns during human–chatbot interactions (Fan et al., 2022; Gieselmann & Sassenberg, 2022; Ischen et al., 2020; Rajaobelina et al., 2021). These studies have contended that this paradox significantly influences users' utilization of chatbots. Gieselmann and Sassenberg (2022) and Ischen et al. (2020) have also investigated the privacy paradox, suggesting that heightened privacy concerns do not necessarily deter users from disclosing personal information. Both utilized measures of user self-disclosure in the context of the privacy paradox. Rajaobelina et al. (2021) also employed this paradox to explore the antecedents of creepiness in such interactions.
4.1.2 Technology theories
Several technology-oriented approaches have guided the conceptualization of conversational text-based AI chatbots in relation to privacy outcomes. The reviewed studies drew on the following theories: Computers as Social Actors (CASA) Theory, the Technology Acceptance Model (TAM), Privacy Calculus Theory, and Gaze Direction Theory. Within this section, these theories are expounded upon to foster a thorough comprehension of their application and impact within this field.
CASA Theory
Lee et al. (2020) considered the CASA theory as the basis for exploring how people could form stronger relationships with a chatbot if it incorporates a self-disclosing feature. According to the theory, individuals tend to unconsciously apply social norms and expectations that are typically associated with humanlike features or human–human relationships, such as empathy, when interacting with computer agents (Lee et al., 2005; Lee et al., 2022; Nass et al., 1994). Social presence is one significant factor of the CASA theory, since users ascribe social cues and conventions to machines, shaping their perception of social presence in human–computer interaction (Lee et al., 2022; Lee & Nass, 2003). Agnihotri and Bhattacharya (2023) argued that, according to this theory, attempts to "make consumers perceive them as having more human-like attributes would enhance chatbots' trustworthiness" (p. 2). Ng et al. (2020) viewed CASA as a relevant perspective in understanding the endeavor to incorporate immersive qualities and humanlike behavioral richness in chatbots. Lim and Shim (2022) adopted the CASA theory as a guiding perspective to examine privacy concerns in the context of AI agent use, specifically in the use of chatbots. They explored the psychological aspects of user responses toward chatbots, incorporating three key variables—intimacy, para-social interaction (PSI), and social presence—to understand the psychological relationship between users and chatbots. These variables served as indicators of the extent to which users perceive the chatbot as a human-like partner and the potential that these perceptions might lead to increased personal information disclosure and heightened privacy concerns. Furthermore, CASA theory asserts that people tend to respond to technologies in a manner similar to their interactions with human partners, utilizing familiar social scripts and rules (Sannon et al., 2020). This response is triggered even when technologies possess minimal social cues or human-like characteristics. Sannon et al. (2020) proposed that these responses are likely to extend to privacy concerns, as the conditions that influence people's trust in computers and computer agents mirror trust-building in interpersonal communication. Considering that individuals often treat human–agent interactions much like human–human interactions, interpersonal theories offered valuable insights into understanding how people would react to privacy issues during social interactions with agents, particularly when these agents exhibit human-like characteristics (Sannon et al., 2020). Song et al. (2022) applied the CASA theory to investigate the role of five key dimensions in shaping consumers' perceived quality of communication during interactions with chatbots, thereby providing valuable insights into human–computer communication dynamics. Widener and Lim (2020) emphasized that human–computer relationships are governed by many of the same social rules that characterize interpersonal interactions, in line with the CASA theory. They adopted the CASA framework as their overarching theoretical perspective, exploring how humans tend to mindlessly apply the same social heuristics used in human interactions to computers, supported by empirical studies (Nass & Moon, 2000; Reeves & Nass, 1996; Sundar & Nass, 2000). The presence of anthropomorphic cues on a computer interface triggers mindless responses, described as the social presence heuristic (Widener & Lim, 2020).
Additionally, agency affordance and interactivity affordance influence information processing through the social presence heuristic and contribute to users' psychological empowerment through active engagement with the interface and content (Widener & Lim, 2020). Regarding the effect of perceived humanness, according to Widener and Lim (2020), users' perception of the chatbot being operated by AI or a human does not significantly impact self-disclosure, social presence, or intimacy. This finding aligns with the CASA theory and suggests that users do not clearly distinguish between their computer counterpart and a human counterpart in certain contexts.
Technology Acceptance Model
Lim and Shim (2022) revealed that individuals perceive that using AI agents could improve their everyday task performance, aligning with the concept of perceived usefulness (PU) from the TAM proposed by Davis et al. (1989) (see Figure 4).
[Figure 4. The Technology Acceptance Model (Davis et al., 1989).]
In Mercieca's study (Mercieca, 2019), the TAM was employed to explore the acceptance and adoption of chatbots. The perceived ease of use (PEOU) and perceived usefulness (PU) of the chatbot technology within its application area, along with cultural norms and social influence, play pivotal roles in predicting its acceptance (Mercieca, 2019). Mercieca (2019) found that PEOU was facilitated by two factors: firstly, most participants had prior exposure to chatbots, making them more receptive to their ease of use; and secondly, chatbots inherited features, such as the keyboard and chat dialogue, that were familiar to participants from other utilities, contributing to their ease of use. Mercieca (2019) also revealed that technical malfunctions and inconsistencies acted as barriers to trust in chatbots, impacting their acceptance and adoption by users. In addition, Kelly et al. (2022) applied the TAM and its extended constructs (PU, PEOU, and trust) to investigate individuals' behavioral intentions in the areas of user acceptance and adoption of AI chatbots in mental healthcare, online shopping, and online banking. TAM was chosen due to its adaptability to new variables and its suitability for analyzing human behavioral intentions toward technology (Kelly et al., 2022). Their proposed TAM (Figure 5) posits that PU and PEOU are the primary factors influencing an individual's behavioral intention to use a technology, subsequently impacting actual system usage (Kelly et al., 2022). Kelly et al. (2022) considered trust an additional predictor because it was found to influence users' intentions toward technology, so it was added as one of the TAM constructs. Prakash et al. (2023) also explored the connection between the TAM attributes of PU, PEOU, and trust in AI-based chatbots and additionally included social presence and privacy risk in their conceptual framework (see Figure 6).
[Figure 5. Proposed TAM for AI chatbot acceptance (Kelly et al., 2022).]
[Figure 6. Conceptual framework incorporating social presence and privacy risk (Prakash et al., 2023).]
Rodríguez Cardona et al. (2021) focused on experienced chatbot users and examined the impact of privacy concerns and trust in chatbot systems, alongside the widely used TAM variables of PU and PEOU. Rese et al. (2020) employed the TAM to measure the acceptance of the chatbot "Emma" among its target segments, including perceived enjoyment as an additional construct (van der Heijden, 2004) and building on previous work (Rietz et al., 2019). Rese et al. (2020) considered PU an extrinsic motivation, where improved performance is expected as a consequence of using the chatbot to achieve utilitarian gratifications such as authenticity of conversation and PU, alongside hedonic factors/positive beliefs such as perceived enjoyment. These two approaches, utilitarian and hedonic, were used to explain usage intention and frequency of use in users' chatbot behavior. PEOU was seen as a determinant or antecedent, positively affecting both perceived usefulness and perceived enjoyment, leading to hedonic gratification. Their proposed TAM relationships for "Emma" are shown in Figure 7.

Privacy Calculus Theory
Kim et al. (2022) considered and applied privacy calculus theory (Dinev & Hart, 2006) in the context of chatbot advertising and its effectiveness. The theory posited that varying levels of privacy concerns determined individuals' degree of responsiveness and reaction to personalized chatbot encounters (Kim et al., 2022). Kim et al. (2022) suggested that consumers are likely to apply privacy calculus only when chatbot advertising delivers highly distinctive knowledge about their personal characteristics. Furthermore, privacy calculus theory helped to answer why consumers show differing responses to personalized chatbot advertising based on their perceptions of risks and benefits. The theory, in conjunction with regulatory focus and privacy concerns, also provided insights into understanding consumers' reactions to personalized chatbot advertising, shedding light on how individuals perform risk–benefit analyses and make decisions regarding their data privacy in the context of ad personalization (Kim et al., 2022).
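To illustrate the risk–benefit logic that privacy calculus theory describes, the minimal sketch below scores a hypothetical disclosure decision as perceived benefits weighed against perceived risks. The weighting scheme and variable names are assumptions made only for illustration, not a model taken from Kim et al. (2022) or Dinev and Hart (2006).

```python
# Illustrative sketch of a privacy-calculus style net-valence score.
# Hypothetical weights and variable names; shows only the idea that
# disclosure intention rises with perceived benefits and falls with
# perceived risks, moderated by dispositional privacy concern.

def disclosure_intention(perceived_benefit: float,
                         perceived_risk: float,
                         privacy_concern: float) -> float:
    """All inputs are assumed to be ratings on a 1-7 Likert-type scale."""
    # Users with stronger dispositional concern weight risks more heavily.
    risk_weight = 1.0 + (privacy_concern - 1.0) / 6.0  # ranges from 1.0 to 2.0
    return perceived_benefit - risk_weight * perceived_risk

# A user who sees high benefit (6) and moderate risk (3) but is highly
# privacy-concerned (7) ends up with a lower net valence than a less
# concerned user (2) facing the same trade-off.
print(disclosure_intention(6, 3, 7))  # 0.0
print(disclosure_intention(6, 3, 2))  # 2.5
```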
4.1.3 Cognitive science theories
By delving into the realms of cognitive science, researchers and practitioners can unravel the intricate connections between human cognition, decision-making, and the safeguarding of personal information. In this review, only one distinct cognitive science theory was identified: theory of mind.
Theory of Mind
A cognitive science theory, theory of mind, refers to an individual's capacity to understand the intentions of others and acknowledge that others possess distinct minds with their own intentions, preferences, and attitudes (Minton et al., 2021). This theory holds significance in marketing, as recent research has demonstrated its critical role in understanding empathy, attribution decisions, responses to social risk, and socially desirable response patterns (Pizzi et al., 2023). Pizzi et al. (2023) explored this theory as a moderator in the relationship between consumers' perceptions of warmth and competence and their skepticism regarding privacy issues. Their study focused on how consumers' theory of mind influences their perception of chatbots and of the gaze direction employed by these virtual entities. Their findings revealed that individuals with a high theory of mind are more adept at identifying persuasion techniques and are therefore more likely to exhibit higher levels of skepticism when it comes to safeguarding their personal information. This heightened ability to infer the intentions of interactive systems, such as chatbots, influences their decision-making processes during these interactions and may ultimately impact their trust in the user interface.
4.2 Methodological approaches
This section reviews the different methodologies utilized to investigate privacy concerns in conversational chatbots, often in combination with specific data collection and analysis approaches. The dominant methodologies in this context have been quantitative and mixed-method data collection and analysis. Of the 38 studies, 5 were qualitative studies with open-ended questions and semi-structured interviews. Sixteen studies were exclusively quantitative, using respondent-level data from surveys. Seventeen were mixed-method, including both qualitative and quantitative approaches. Most of the user studies (n = 25) in the corpus had a substantial number of participants (more than 200). The research with the least participation (n = 8) was conducted by Dev and Dev (2023). Table 5 presents this in more detail. In this section, the studies are grouped into three subsections based on their methodological approach: quantitative approaches, qualitative approaches, and mixed-method approaches. Within each subsection, the methods employed in the analysis of collected data and their relation to privacy are reviewed.
Participants | Qualitative studies | Quantitative studies | Mixed-method studies |
---|---|---|---|
>0, ≤10 | Dev and Dev (2023) | 0 | 0 |
>10, ≤60 | Dev (2022); Tlili et al. (2023); Völkel et al. (2020) | 0 | Balaji (2019); Griffin et al. (2021); Lee et al. (2020); Lim and Shim (2022); Mercieca (2019) |
>60, ≤100 | 0 | Sohn et al. (2019) | Bouhia et al. (2022) |
>100, ≤200 | 0 | Benke et al. (2022); van Eeuwen (2017); Song et al. (2022) | 0 |
>200 | Ng et al. (2020) | Belen Saglam et al. (2021); Cheng and Jiang (2020); de Cosmo et al. (2021); Fan et al. (2021); Fan et al. (2022); Kelly et al. (2022); Lee et al. (2022); Pizzi et al. (2023); Prakash et al. (2023); Rajaobelina et al. (2021); Rodríguez Cardona et al. (2021); Marjerison et al. (2022) | Biswas (2020); Boucher et al. (2021); Belen-Saglam et al. (2022); Gieselmann and Sassenberg (2022); Ischen et al. (2020); Kim et al. (2022); Lappeman et al. (2023); Rese et al. (2020); Sannon et al. (2020); Widener and Lim (2020); Agnihotri and Bhattacharya (2023) |
4.2.1 Quantitative method approaches
Belen Saglam et al. (2021) conducted an empirical user study with 491 participants, using surveys to explore potential risks and concerns users may have while interacting with AI-based chatbots. The study employed a 7-point Likert scale to measure users' feelings about various aspects of chatbot-centered questions. The researchers used a combination of descriptive and quantitative statistical analysis to analyze the survey responses. Cheng and Jiang (2020) designed an online survey using a 7-point Likert scale for data collection. They recruited 1064 participants via Amazon Mechanical Turk (MTurk). For data analysis, they utilized structural equation modeling (SEM), a statistical method that enables researchers to assess complex relationships among variables and validate theoretical models, using Mplus software. Pizzi et al. (2023) also conducted two experiments using online surveys. First, they employed a between-subjects experimental design with factors including gaze direction (direct vs. averted), anthropomorphism (low vs. high), and chatbot gender (male vs. female) with 451 participants. Second, they used a between-subjects experimental design with the same factors as the first experiment with 800 participants. They analyzed the data with ANOVA. de Cosmo et al. (2021) designed a survey questionnaire using 5-point Likert scales to collect data from 846 participants. For analysis, they (2021) performed several statistical procedures: (1) assessed the multivariate normality of the data using Mardia's test; (2) evaluated the internal consistency of the measurement scales by calculating Cronbach's α coefficient, which ensured that the measurement scales were reliable and consistent in assessing the underlying constructs; (3) derived unidimensional values for each scale by averaging the item responses; and (4) conducted a moderated mediation analysis using the PROCESS macro in SPSS. van Eeuwen (2017) employed a quantitative cross-sectional research design, formulating hypotheses and conducting an online survey. van Eeuwen (2017) asked 195 participants to rate their agreement with statements on a 5-point Likert scale, reflecting factors, such as perceived usefulness, internet privacy concerns, and trust, influencing their intention to use chatbots. The data were analyzed with simple regression analysis to examine the relationships between the independent variables and the dependent variable, that is, the intention to use chatbots for mobile commerce (van Eeuwen, 2017). To understand Chinese people's acceptance of chatbots, Marjerison et al. (2022) used a quantitative methodology and collected data through an anonymous online survey of 540 participants who identified themselves as Chinese online shoppers who were already familiar with chatbots. In a 7-point Likert scale survey, they tested their six hypotheses across three dimensions: technology acceptance of chatbots; hedonic factors (enjoyment, passing time, and behavioral intention); and users' privacy concerns and risks.
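To make the kinds of analyses described above more concrete, the sketch below uses synthetic data and hypothetical variable names (not the studies' actual datasets) to illustrate two of the reported procedures: computing Cronbach's α for a multi-item privacy-concern scale, as in de Cosmo et al. (2021), and fitting a simple regression of intention to use on privacy concern, in the spirit of van Eeuwen (2017).

```python
# Minimal sketch with synthetic Likert data; illustrative only.
import numpy as np

rng = np.random.default_rng(0)
n = 200                                   # hypothetical number of respondents

# Synthetic 5-point Likert responses for a 4-item privacy-concern scale.
latent = rng.normal(0, 1, n)
items = np.clip(np.round(3 + latent[:, None] + rng.normal(0, 0.7, (n, 4))), 1, 5)

def cronbach_alpha(item_scores: np.ndarray) -> float:
    """item_scores: respondents x items matrix of Likert ratings."""
    k = item_scores.shape[1]
    item_var = item_scores.var(axis=0, ddof=1).sum()
    total_var = item_scores.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_var / total_var)

print("Cronbach's alpha:", round(cronbach_alpha(items), 2))

# Simple regression: intention to use (1-5) on the averaged privacy-concern score.
privacy_concern = items.mean(axis=1)
intention = np.clip(np.round(5 - 0.5 * privacy_concern + rng.normal(0, 0.8, n)), 1, 5)
slope, intercept = np.polyfit(privacy_concern, intention, 1)
print(f"fitted model: intention = {intercept:.2f} + {slope:.2f} * privacy_concern")
```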
Fan et al. (2021) adopted a data-driven approach by analyzing the system log of DoctorBot, a widely deployed self-diagnosis chatbot in China. The dataset comprised 47,684 consultation sessions initiated by 16,519 users. To comprehensively analyze their dataset, they employed both statistical analysis and content analysis. Kelly et al. (2022) conducted a cross-sectional, within-subjects survey, consisting of a 71-item online questionnaire, to evaluate 360 participants' future behavioral intentions toward AI chatbot usage. Privacy concerns and trust were measured using a 5-point Likert scale, and a 7-point Likert scale was adapted to assess privacy concerns across each scenario. For data analysis, statistical analysis was employed, and for the open-ended questions, deductive content analysis was used to identify common themes such as concerns about loss of humanity, job loss, privacy, and inadequate skills. Benke et al. (2022) used an online experiment with three experimental conditions (a between-participants design). Participants were randomly assigned to a prescripted, fictitious group chat with an emotion-aware chatbot that provided feedback on team members' emotions. To assess the participants' responses (n = 176), their study employed survey questions rated on 7-point Likert scales. For analysis, they (2022) used ANOVA. Prakash et al. (2023) conducted a user study where participants (n = 221) responded to a questionnaire, rating all items on a 5-point Likert scale. They (2023) empirically tested their proposed conceptual model using the data collected through the questionnaire. To analyze the data and assess the relationships between the variables in the model, they (2023) employed partial least squares structural equation modeling (PLS-SEM), using SmartPLS v3.3 software. Rodríguez Cardona et al. (2021) conducted a user study with 215 participants, using a standardized online cross-sectional survey to explore the relationships underlying the intention to use chatbots in the insurance business. For data analysis, they employed PLS-SEM. The survey questionnaire consisted of closed questions with 22 measurement items related to trust, privacy concerns, perceived ease of use, perceived usefulness, and intention to use. Trust, privacy concerns, and acceptance constructs were measured using a 5-point Likert scale.
Rajaobelina et al. (2021) conducted a standardized online cross-sectional survey, including 430 participants. The survey questionnaire consisted of closed questions related to 22 measurement items, covering trust, privacy concerns, perceived ease of use, perceived usefulness, and intention to use. Both privacy concerns and acceptance constructs were measured using a 5-point Likert scale. Sohn et al. (2019) tested their hypotheses with two experimental studies combined with a survey of 74 participants, measuring both the effect of social presence on user privacy and the effect of the presence of a conversational user interface (CUI) on perceptions of being watched. Sohn et al. (2019) analyzed the data with ANOVA. Song et al. (2022) also conducted two experimental studies (n = 156 in the first study and n = 157 in the second), testing 13 hypotheses. They (2022) developed these hypotheses and tested them in group experiments designed around a single factor (service agent type: chatbot vs. human). Participants were surveyed using a 5-point Likert scale. ANOVA was employed for the analysis.
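Several of the quantitative studies above (e.g., Sohn et al., 2019; Benke et al., 2022; Song et al., 2022) compare Likert ratings across experimental conditions using ANOVA. The following minimal sketch, with synthetic data and hypothetical condition names rather than the studies' actual variables, shows what such a one-way comparison looks like in practice.

```python
# Illustrative one-way ANOVA on synthetic 7-point Likert ratings;
# condition names and values are hypothetical, not drawn from the reviewed studies.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical privacy-concern ratings under three interaction conditions.
chatbot_condition = np.clip(np.round(rng.normal(4.5, 1.0, 60)), 1, 7)
human_condition   = np.clip(np.round(rng.normal(4.0, 1.0, 60)), 1, 7)
website_condition = np.clip(np.round(rng.normal(3.5, 1.0, 60)), 1, 7)

# Test whether mean ratings differ across the three conditions.
f_stat, p_value = stats.f_oneway(chatbot_condition, human_condition, website_condition)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
```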
4.2.2 Qualitative method approaches
Ng et al. (2020) measured trust, privacy concerns, social presence, and intention to use chatbots through an online experiment using a secure and reliable chatbot, XRO23, with 191 participants. In the experimental group, participants were presented with a vignette describing a more human-like chatbot named Emma, while the control group received the reliable and secure chatbot. The vignette-style methodology was used as an induction protocol to assess 219 participants' responses and preferences, providing valuable insights into users' attitudes and acceptance of chatbot features and privacy-friendly interactions. Völkel et al. (2020) also employed a chatbot, Kai, in their study. The study involved participants (n = 21) interacting twice with the chatbot: first, engaging in customer service scenarios to receive their actual profile (baseline), and then a second time attempting to disguise their personality strategically to trick the chatbot into calculating a falsified profile. They (2020) conducted interviews, questionnaires, and thematic analysis to gain deep, qualitative insights into participants' assumptions about the factors the chatbot used to calculate their profile and the strategies that could be effective in tricking the system.
Dev (2022) explored privacy perceptions through chatbot–human text conversations and thematic analysis of 37 chatlogs from a conversational chatbot, Cleverbot. Dev and Dev (2023) conducted semi-structured interviews with eight participants, employing thematic analysis to gather insights into users' perspectives and motivations, contributing to the development of privacy-friendly chatbots and enhancing users' trust and engagement with the technology in retail settings. Tlili et al. (2023) employed a qualitative instrumental case study with 21 participants to investigate ChatGPT in education. The study included three stages: (1) social network analysis of tweets, which showed positive enthusiasm and cautious attitudes toward using ChatGPT in education; (2) examination of ChatGPT's impact on education, including response quality, usefulness, personality, emotion, and ethics; and (3) investigation of user experiences through 10 educational scenarios, revealing issues like cheating, honesty, privacy concerns, misleading information, and manipulation.
4.2.3 Mixed method approaches
Biswas (2020) conducted a mixed-method study with two cases: sentiment analysis and open-ended queries. Sentiment analysis was used to validate the effectiveness of the privacy-preserving methods with 5000 participants. Bouhia et al. (2022) conducted a user study among 95 online panelists to investigate privacy concerns in chatbot interactions. They (2022) utilized the driver concept, which involved four variables, and applied them to users interacting with a simulated chatbot. Participants engaged with the chatbot, and afterward, they were given a follow-up survey. The survey was designed using a 7-point Likert scale, which allowed participants to express their attitudes and feelings on various aspects related to their privacy concerns while using the chatbot. To measure privacy concerns in chatbots, the researchers employed four variables: privacy concerns, creepiness, familiarity with chatbots, and perceived risks. These variables likely represented different aspects of privacy that the study aimed to assess. For data analysis, they (2022) used a structural model and hypothesis testing. This analysis allowed them to examine the relationships between the identified variables and how they contribute to users' privacy concerns while interacting with the chatbot. Moreover, Boucher et al. (2021) both surveyed 203 participants and explored their interactions with a mental health care AI, Anna, in order to examine users' potential challenges with chatbots, including personal data usage and trust.
Gieselmann and Sassenberg (2022) employed two correlation studies and three experiments (consisting of five online case studies). They (2022) asked participants to interact with chatbots and then surveyed them about their experiences using a 7-point Likert scale. By conducting both correlation studies and experiments, they (2022) examined the relationships between different competencies and disclosure tendencies in both users and non-users of conversational AI. Additionally, they performed meta-analyses. Ischen et al. (2020) conducted an experimental design with three conditions (human-like chatbot, machine-like chatbot, and website), randomly assigning participants to each group. They (2020) conducted multiple mediation analyses to analyze the data and explore the complex relationships between privacy concerns, users' attitudes, and their comfort levels in sharing personal information with different types of entities in chatbot interactions.
Lee et al. (2020) designed, implemented, and evaluated a chatbot with self-disclosure features for small talk interactions using Manychat and Google Dialogflow. They (2020) conducted a study with 47 participants divided into three groups, each group using a different chatting style of the chatbot for 3 weeks. They (2020) employed surveys, face-to-face semi-structured interviews, and LIWC2015 to calculate the word length and self-disclosure level in the chats. Mixed-model ANOVA and thematic content analysis were used to examine the data. Lee et al. (2022) designed and implemented a text-based AI psychotherapy chatbot that leverages rapport-facilitating dialogue to create a mutually stable and favorable relationship with users, encouraging them to share personal information. Then, they (2022) conducted an online survey using a 5-point Likert scale to assess users' perceptions and experiences with the chatbot. Balaji (2019) developed a valid and reliable measurement tool for assessing user satisfaction with text-based information chatbots, useful for both businesses and researchers to evaluate interaction quality efficiently. A questionnaire was created based on the feedback and rated on a 5-point Likert scale. Balaji (2019) performed confirmatory factor analysis, parallel analysis, exploratory factor analysis, and reliability analysis to validate and ensure the tool's ability to evaluate user satisfaction effectively and concisely with text-based information chatbots. Belen-Saglam et al. (2022) administered an online questionnaire using a 6-point Likert scale. They (2022) employed a mixed-method approach to gain insights into users' perceptions and concerns regarding sensitivity in the context of their study. Sannon et al. (2020) conducted a 3 (social interactivity) × 3 (data sharing practices) factorial design experiment, manipulating the agents' social interactivity and data sharing practices to understand their influence on participants' judgments regarding potential privacy violations and their evaluations of the agents. They (2020) employed multiple survey scales (7-, 9-, and 12-point Likert scales) to assess various aspects, including behavioral intentions and manipulation checks.
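Lee et al. (2020), described above, used LIWC2015 to quantify word length and self-disclosure in chat logs. LIWC is proprietary, so the sketch below is only a simplified stand-in: it counts words per message and uses small, hypothetical word lists for first-person pronouns and feeling terms as a crude self-disclosure proxy.

```python
# Simplified, hypothetical stand-in for LIWC-style chat-log quantification;
# word lists are illustrative, not the LIWC2015 dictionaries.
FIRST_PERSON = {"i", "me", "my", "mine", "myself"}
FEELING_WORDS = {"feel", "worried", "anxious", "happy", "sad", "afraid"}

def message_stats(message: str) -> dict:
    # Lowercase tokens with trailing punctuation stripped.
    tokens = [t.strip(".,!?").lower() for t in message.split()]
    return {
        "word_count": len(tokens),
        "first_person": sum(t in FIRST_PERSON for t in tokens),
        "feeling_words": sum(t in FEELING_WORDS for t in tokens),
    }

print(message_stats("I feel anxious about sharing my health data with a chatbot."))
# {'word_count': 11, 'first_person': 2, 'feeling_words': 2}
```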
Griffin et al. (2021) conducted semi-structured interviews and applied descriptive statistics for the quantitative approach. Kim et al. (2022) employed a between-subjects design with two factors: personalization of the chatbot ad (high vs. low) and regulatory focus (promotion vs. prevention). After participants interacted with the simulations, they completed a post-experiment survey using a 7-point Likert scale. Agnihotri and Bhattacharya (2023) employed a mixed-methodology approach to test multiple hypotheses about human–chatbot interaction. They surveyed around 500 UK consumers, conducted semi-structured interviews with 18 UK consumers, and used a 7-point Likert scale in their survey. Survey findings, including privacy concerns, were tested with a structural equation model, and the interview data were analyzed qualitatively. Lappeman et al. (2023) employed a conclusive, pre-experimental, two-group, one-shot case study design with non-probability snowball and informal sampling. They (2023) used 7-point Likert scales to measure the reliability and validity of factors including privacy concerns, cognitive trust (competence and integrity), emotional trust, brand trust, and user self-disclosure. Data analysis involved structural equation modeling (SEM), including covariance-based SEM, to investigate the relationships between variables and understand user trust and privacy concerns in chatbot-driven digital banking services. Lim and Shim (2022) used a survey with focus group interviews to explore motivations for using AI agents. They applied structured interviewing techniques and questionnaire construction following Weller's (1998) guidelines. Privacy concerns were assessed using a 7-point Likert scale.
Mercieca (2019) combined interviews, observations, and surveys to gather data. Thematic analysis was used to analyze the qualitative data, and a human–computer trust survey analysis was conducted. Rese et al. (2020) investigated the factors impacting user engagement with the “Emma” chatbot. Participants interacted with the chatbot for a shopping task, followed by a questionnaire using a 7-point Likert scale to evaluate their experience. They (2020) analyzed the data through factor analysis to identify the factors influencing user acceptance and intention to use the chatbot. Widener and Lim (2020) used an online experimental design, varying perceived humanness through the “Wizard of Oz” technique. Participants were randomly assigned to either an “AI chatbot” or a “human,” unaware that both were controlled by a confederate. After the interaction, participants completed a questionnaire with a 7-point Likert scale to measure self-disclosure, social presence, and intimacy perceptions.
4.3 Privacy findings
In addition to analyzing the corpus of 38 papers based on their theoretical and methodological approaches, a detailed analysis of their privacy findings was performed. These privacy findings offer insights into the current direction of user studies within this research field. Papers are categorized by their privacy findings in terms of user privacy harms and risks: (1) decision making and manipulation, (2) self-disclosure, (3) human autonomy, bias, and trust, (4) data collection and storage, (5) secondary use, (6) legal compliance, and (7) data breach and security. This categorization can be seen in Table 6. The timeline of paper publications based on the privacy findings can also be seen in Table 7.
Privacy findings | Papers |
---|---|
Decision making and manipulation | van Eeuwen (2017); Balaji (2019); Cheng and Jiang (2020); de Cosmo et al. (2021); Dev (2022); Kelly et al. (2022); Kim et al. (2022); Lappeman et al. (2023); Pizzi et al. (2023); Völkel et al. (2020); Marjerison et al. (2022) |
Self-disclosure | van Eeuwen (2017); Belen Saglam et al. (2021); Belen-Saglam et al. (2022); Benke et al. (2022); Biswas (2020); Boucher et al. (2021); Cheng and Jiang (2020); de Cosmo et al. (2021); Dev (2022); Fan et al. (2021); Fan et al. (2022); Gieselmann and Sassenberg (2022); Griffin et al. (2021); Ischen et al. (2020); Kim et al. (2022); Lappeman et al. (2023); Lee et al. (2020); Lee et al. (2022); Mercieca (2019); Ng et al. (2020); Pizzi et al. (2023); Sannon et al. (2020); Sohn et al. (2019); Song et al. (2022); Völkel et al. (2020); Widener and Lim (2020); Lim and Shim (2022); Agnihotri and Bhattacharya (2023) |
Human autonomy, bias, and trust | Benke et al. (2022); Bouhia et al. (2022); de Cosmo et al. (2021); Prakash et al. (2023); Sohn et al. (2019); Song et al. (2022); Rese et al. (2020); Tlili et al. (2023); Agnihotri and Bhattacharya (2023) |
Data collection and storage | Dev and Dev (2023); Rajaobelina et al. (2021); Sannon et al. (2020); Tlili et al. (2023); Griffin et al. (2021); Ischen et al. (2020); Völkel et al. (2020); Agnihotri and Bhattacharya (2023) |
Secondary use | Ng et al. (2020); Rodríguez Cardona et al. (2021); Agnihotri and Bhattacharya (2023) |
Legal compliance | Dev (2022); Ng et al. (2020); Rodríguez Cardona et al. (2021); Song et al. (2022); Völkel et al. (2020); Dev and Dev (2023); Rajaobelina et al. (2021); Belen Saglam et al. (2021); Belen-Saglam et al. (2022) |
Data breach and security | Kim et al. (2022); Ng et al. (2020); Sannon et al. (2020); Dev and Dev (2023); Marjerison et al. (2022); Agnihotri and Bhattacharya (2023) |
Year | 2017 | 2018 | 2019 | 2020 | 2021 | 2022 | 2023 |
---|---|---|---|---|---|---|---|
Decision making and manipulation | 1 | 0 | 1 | 1 | 2 | 5 | 2 |
Self-disclosure | 1 | 0 | 1 | 6 | 5 | 10 | 3 |
Human autonomy, bias, and trust | 0 | 0 | 1 | 1 | 1 | 4 | 2 |
Data collection and storage | 0 | 0 | 0 | 1 | 2 | 1 | 3 |
Secondary use | 0 | 0 | 0 | 1 | 1 | 0 | 1 |
Legal compliance, transparency, and consent | 0 | 0 | 0 | 1 | 3 | 4 | 1 |
Data breach and security | 0 | 0 | 0 | 1 | 1 | 2 | 2 |
- Note: All papers were published between 2017 and August 2023.
4.3.1 Decision-making and manipulation
Decision making in conversational chatbots involves selecting appropriate responses based on the input and context, often utilizing predefined rules, ML models, or a combination of both. Manipulation can be defined as the deliberate control or biasing of chatbot responses to achieve specific goals, such as promoting a product or shaping user behavior, which can raise ethical concerns in AI development. Balaji (2019) emphasized the importance of perceived privacy in users' interactions with chatbots and its potential influence on ethical decision-making. de Cosmo et al. (2021) showed that internet privacy concerns negatively impact users' decisions about using chatbots. Cheng and Jiang (2020) found that users exhibit hesitation in decision-making when using AI chatbots, possibly related to uncertainty about how their data are being used and whether this could lead to negative consequences. Marjerison et al. (2022) found that chatbots can manipulate users' behavioral intentions during the interaction. Lappeman et al. (2023) highlighted that while a strong brand can influence decision-making, it alone is insufficient to significantly increase user self-disclosure. Dev (2022) identified sensitive data categories in chatlogs under the European Union's (EU) General Data Protection Regulation (GDPR) that chatbots may manipulate users into disclosing. Pizzi et al.'s (2023) findings indicated that users' perceptions of the chatbot's warmth and competence can influence their decision-making regarding the disclosure of personal information. In Völkel et al. (2020), participants could influence the chatbot's personality assessment by around 10% with specific strategies, but they found the process too cumbersome for everyday use. Even though manipulation was possible, participants hesitated to share their profiles with others, highlighting the significance of privacy concerns in chatbot interactions. In Kelly et al. (2022), privacy concerns were negatively associated with behavioral intentions, suggesting that users may be hesitant to use chatbots due to privacy worries. However, when trust was introduced into the model, privacy concerns became a positive predictor of behavioral intentions, implying that as trust in chatbots increases, users are more willing to use them despite their initial privacy concerns. Kelly et al. (2022) also found that privacy concerns may vary depending on the context, with participants altering their perceptions based on different scenarios. Kim et al. (2022) showed that consumers' concerns about highly personalized ads are shaped by their perceptions of the risks and benefits of these ads. These perceptions strongly influence both privacy concerns and decision-making regarding personalized advertisements.
4.3.2 Self-disclosure
Dev (2022) found that manipulating users' decisions through an AI chatbot can cause users to disclose personal information. Fan et al. (2022) addressed unauthorized disclosure in chatbots. Gieselmann and Sassenberg (2022) also found that AI chatbots carry the privacy risk of users revealing, and being willing to disclose, personal information. Belen Saglam et al. (2021) argued that user concerns revolved around disclosing personal information, deleting that information afterward, and worries about inappropriate use of their data, indicating a desire for more control over their information after interacting with chatbots. Agnihotri and Bhattacharya (2023) also found that users had concerns about losing control of their data to a chatbot. In Belen-Saglam et al.'s (2022) study, participants showed reluctance to disclose information they considered irrelevant or out of context, with variations observed in different domains such as healthcare and finance. The context and fairness of the data request had a significant impact on participants' comfort levels in disclosing personal information (Belen Saglam et al., 2021).
Benke et al. (2022) investigated the disclosure of innermost emotions and its impact on autonomy and trust in emotion-aware chatbots. They (2022) found that higher control levels in chatbots led to increased autonomy and trust among users. Biswas (2020) indicated that users tend to share and disclose personally identifiable information to chatbots as if they were interacting with a human being. Boucher et al. (2021) found that chatbots can serve as effective adjunctive options in mental health therapy, potentially facilitating increased self-disclosure and improving the effectiveness of in-person therapy. Cheng and Jiang (2020) revealed that users experience hesitation in disclosing information to chatbots and have concerns about sharing their data with third parties. In Lappeman et al. (2023), privacy concerns were found to have a significantly negative impact on user self-disclosure in both treatment groups. Participants who were exposed to their preferred banking brand exhibited lower levels of user self-disclosure and brand trust compared to those exposed to a fictitious banking brand in the South African context.
Lee et al. (2020) found that chatbot self-disclosure had a reciprocal effect, promoting deeper participant self-disclosure over time. It also positively affected participants' perceived intimacy and enjoyment. Lee et al. (2022) revealed that a sense of rapport with the chatbot did not directly affect users' self-disclosure, but it significantly increased the sense of social presence, indirectly influencing self-disclosure intentions. Ng et al. (2020) identified several privacy risks related to chatbot interactions, including willful self-disclosure behavior. They found that a human-like chatbot did not increase participants' trust levels or lower their privacy concerns, despite increasing the perception of social presence. However, the intention to use the presented chatbot for financial support was positively influenced by perceived humanness and trust in the bot. In Lim and Shim (2022), the findings suggested that individuals use AI agents mainly for recreational needs, efficiency in daily tasks, and relaxation. Social presence was found to have a negative association with privacy concerns, meaning that individuals who feel a greater social connection with AI agents are less worried about privacy when disclosing personal information.
Many studies also showed that personalization and anthropomorphism have a direct effect on disclosing personal information to AI chatbots. Mercieca (2019) treated personalization as a human-likeness feature of chatbot technology. Some studies discussed how personalization can be encouraged through self-disclosure to enhance the user experience by tailoring interactions based on individual preferences (Belen Saglam et al., 2021; de Cosmo et al., 2021; Gieselmann & Sassenberg, 2022; Lappeman et al., 2023; Miao et al., 2017). In Griffin et al. (2021), participants expressed curiosity and perceived chatbots as humanlike. However, some concerns were raised, including the potential for chatbots to provide excessive information, demand lifestyle changes, invade privacy, and experience usability issues on smartphones. Ischen et al. (2020) also found that chatbots designed to be more human-like result in increased disclosure of personal information and adherence to recommendations. This effect was mediated by the perception of higher anthropomorphism, which leads to lower privacy concerns compared to chatbots with a machine-like appearance or behavior (Ischen et al., 2020).
According to Kim et al. (2022), individuals with strong dispositional concerns about privacy would place more weight on the potential risks of disclosing personal information and less weight on the benefits of personalized chatbot ads. Dev (2022) argued that highly personalized or anthropomorphized chatbots improve human–chatbot engagement. Fan et al. (2022) and Völkel et al. (2020) considered this to be a tradeoff between privacy risks and personalization benefits, known as the personalization–privacy paradox. Völkel et al. (2020) also argued that personality can be assessed and users tricked by such systems, which increases users' willingness to disclose information in various ways (Völkel et al., 2020). Sannon et al. (2020) indicated that social interactivity plays a role in how people perceive agents' privacy-related behaviors, which could lead to potential privacy risks for users in interactions with socially interactive conversational agents.
Pizzi et al. (2023) identified privacy risks related to consumers' willingness to disclose personal information and their purchase intentions when interacting with chatbots. They (2023) found that consumers' perceptions of warmth and competence in the chatbot influenced their skepticism toward the chatbot and, subsequently, their trust toward the service provider hosting the chatbot. Song et al. (2022) also showed that users' willingness to accept service agents is enhanced if they perceive lower privacy risks during the interaction. Contrary to expectations, they also indicated that participants appeared to associate human service agents with higher privacy risks than chatbots. This suggests that users may believe human beings are more likely to be motivated by subjective interests to disclose users' private information (Song et al., 2022). On the other hand, Widener and Lim (2020) determined that perceived humanness had no significant impact on self-disclosure, social presence, or intimacy. They also examined the role of the NTB in interactions with artificial agents, but privacy concerns did not moderate this relationship. The study's implications, particularly regarding the increasing use of AI agent services, were discussed, indicating that users' perceptions of humanness may not greatly influence self-disclosure and other interaction aspects.
4.3.3 Human autonomy, bias, and trust
The studies reviewed show that privacy concerns are closely tied to human autonomy, with individuals expressing unease when their personal data is collected and used without their explicit consent, raising questions about their data control. Bias is identified as a critical factor, as privacy breaches can disproportionately impact certain demographics, perpetuating inequalities, and undermining fairness. Trust is a key issue, with participants being hesitant to engage with systems or platforms they see as untrustworthy in handling their sensitive data, emphasizing the importance of trust-building measures in privacy protection. These findings underscore the multifaceted nature of privacy concerns, spanning issues of autonomy, bias, and trust in the digital age.
Benke et al. (2022) emphasized the importance of user control in enhancing trust and autonomy in AI-based chatbot interactions and provided valuable insights for the design and implementation of privacy-preserving emotion-aware chatbots. Song et al. (2022) pointed out the significance of humanity and ethical clarity in conversational text-based AI chatbots, indicating that their absence could lead to user discomfort and a lack of trust in engaging with the chatbot. Tlili et al.'s (2023) practical implications suggested the development of responsible chatbots in education that go beyond typical privacy issues and focus on preserving human values. Rese et al. (2020) found that privacy concerns and the technology's immaturity negatively affected users' intention to use and the frequency of usage of the chatbot, Emma. Bouhia et al.'s (2022) study indicated that creepiness and perceived risks during interactions with humanoid technology are key factors influencing users' privacy concerns when interacting with chatbots, particularly in contexts where sensitive information is involved (e.g., exchanges about automobile insurance). de Cosmo et al. (2021) and Agnihotri and Bhattacharya (2023) addressed the connection between privacy concerns, trust in AI systems, companies, and data management systems, showing that trust plays a critical role in shaping users' perception of privacy. Additionally, Sohn et al. (2019) and Prakash et al. (2023) delved into the impact of privacy risk on trust formation in AI-based customer service chatbots. In Prakash et al.'s (2023) study on privacy concerns and trust in AI-based customer service chatbots, the research showed that the chatbot's conversational cues shape individuals' perceptions of its functional and social qualities, subsequently influencing trust formation. Trust, in turn, affects people's intentions toward the chatbot. Notably, privacy risk was not a significant predictor of trust in this study.
4.3.4 Data collection and storage
Data collection and storage in conversational chatbots have become integral aspects of modern technology, enabling personalized user experiences and improved service delivery. However, this convenience raises significant privacy concerns, as the vast amounts of data gathered by chatbots may be vulnerable to breaches or misuse, emphasizing the need for robust privacy measures to protect users' sensitive information.
Many studies (Rajaobelina et al., 2021; Sannon et al., 2020) have addressed the risks associated with AI chatbots' collection of personal information and their storage practices. Agnihotri and Bhattacharya (2023) showed that users expressed significant concerns regarding the data collection processes employed by chatbots. Dev and Dev (2023) found that these chatbots facilitate targeted ads and spam, raising worries about retail bots that aggregate non-removable personal information. In their study, participants preferred not to share sensitive data, such as social security numbers and personal addresses, with chatbots because of the risks posed by massive data collection and storage. This risk is also called an invasion of privacy and refers to the unauthorized intrusion into an individual's private life or personal information, encroaching upon their right to maintain confidentiality (Solove, 2006). Similar concerns about the invasion of privacy were also mentioned by Ischen et al. (2020), Griffin et al. (2021), and Völkel et al. (2020). Therefore, it can be argued that in the context of data collection and storage in conversational AI chatbots, an invasion of privacy can occur when these bots gather and store personal information without users' explicit consent or knowledge.
4.3.5 Secondary use
Solove (2006) defines secondary use as a privacy harm that refers to using personal information for a purpose other than the one for which it was originally collected. Several studies have raised concerns about the privacy implications of the secondary use of users' data. Ng et al. (2020) found that their financial chatbot may collect sensitive information beyond users' expectations, raising privacy concerns. Similarly, Rodríguez Cardona et al. (2021) identified cases of unauthorized secondary use of personal information without informing users of the purpose of collection. Moreover, Agnihotri and Bhattacharya (2023) argued that chatbots are designed to serve users' intentions during interactions while concurrently acting as instruments for collecting data for various purposes, emphasizing that these chatbots use the gathered data for additional objectives. These studies underscore the significance of transparency in data processing to address privacy risks.
4.3.6 Legal compliance
Ensuring that chatbots adhere to relevant privacy laws and regulations, such as the GDPR, is crucial to protect users' personal information and maintain their privacy rights. Dev (2022), Belen Saglam et al. (2021), and Belen-Saglam et al. (2022) emphasized the importance of legal compliance in addressing privacy risks in chatbots. Ng et al. (2020) and Rodríguez Cardona et al. (2021) highlighted concerns about the lack of user control and data privacy rights during the interaction between human and chatbot. They emphasized the importance of considering the dynamic process of interpersonal boundary control in such interactions, which, when applied to interactions with software apps or chatbots, underscores the critical need to empower users with the ability to define and manage the boundaries of their personal information, privacy, and engagement within these digital exchanges. Song et al. (2022) and Völkel et al. (2020) have also shown that while interacting with chatbots, users find themselves in an environment lacking transparency. Rajaobelina et al. (2021) recommended clear and transparent privacy policies, opt-in/opt-out options, and providing onboarding information like FAQs or video tutorials to inform users about the chatbot's operation and data collection. In Dev and Dev's (2023) study, participants expressed concerns about chatbots aggregating information from various sources. They suggested implementing a consent renewal system for better data tracking, reducing third-party sharing, and enhancing chatbot transparency to address these privacy concerns.
4.3.7 Data breaches and security concerns
Several studies have investigated the privacy risks related to data breaches and security concerns in conversational chatbots. Gamble (2020) discussed the potential risks associated with data breaches and security concerns in chatbots. Kim et al. (2022) highlighted the importance of addressing privacy concerns in AI chatbot interactions to ensure consumer trust and the protection and security of personal data. Ng et al. (2020) emphasized the need for enhanced security measures to address potential privacy risks and data breaches in AI chatbots, especially when handling sensitive financial information. Marjerison et al. (2022) asserted that data breaches and leakage involving chatbots in the e-commerce space can potentially give rise to security concerns for users. Agnihotri and Bhattacharya (2023) found that data breaches and leakage have a negative effect on users' engagement with chatbots. Sannon et al. (2020) and Dev and Dev (2023) highlighted the need for better security practices and transparency from chatbot service providers to protect users' privacy and data.
5 DISCUSSION AND IMPLICATIONS
The analysis of the corpus of 38 studies on conversational text-based AI chatbots conducted in the last 5 years shows that user privacy concerns have been explored comprehensively, featuring both empirical and conceptual explanations. Interestingly, most of the research (n = 33) has been published after 2020, which indicates growing interest in this area. Methodologically, papers primarily employ either quantitative or mixed-method approaches, with more than 10 participants, to comprehend user privacy concerns in human–chatbot interactions. Most of the user studies (n = 25) in the corpus had a substantial number of participants (more than 200). The research with the least participation (n = 8) was conducted by Dev and Dev (2023) using a qualitative approach. This indicates a lack of qualitative studies in this area (Dev, 2022; Dev & Dev, 2023; Ng et al., 2020; Tlili et al., 2023; Völkel et al., 2020).
It is recognized and acknowledged that there are valuable research contributions toward addressing user privacy concerns in conversational text-based AI chatbots. However, it is crucial to emphasize specific research limitations and challenges identified in this literature review that offer opportunities for further exploration and expansion to enhance the overall understanding and development of this field. Thus, more studies are required to investigate how users navigate and contend with the privacy harms and risks associated with conversational text-based AI chatbots. This section presents the existing research challenges and limitations, as well as recommendations for future directions in this research area, based on the analysis of the papers.
5.1 Theoretical growth
In this review, an examination of the theoretical approaches employed in various papers reveals a lack of a specific and shared theoretical framework for understanding privacy concerns in conversational text-based AI chatbot studies. It is evident that these studies draw upon theories from three distinct areas: social science, information technology management (technology), and cognitive science. Predominantly, major theories originate from social science (n = 17), with cognitive science contributing the least (n = 1). Within the realm of social science, studies predominantly leverage theories such as innovation diffusion theory, privacy/technology paradox, SPT, SPT2, SRT, and U&G theories. In the technology domain, CASA theory and TAM are the most frequently utilized theories and models to comprehend the impact of chatbots on behaviors. Notably, TAM undergoes evolution over the years (i.e., Kelly et al., 2022; Prakash et al., 2023; Rese et al., 2020) in these studies, specifically in examining privacy risks. In cognitive science, Pizzi et al. (2023) apply the theory of mind to grasp a user's intentions, preferences, and attitudes during interactions with chatbots.
Consequently, a common theoretical understanding across these studies is absent. However, it is evident that these studies effectively explore user behaviors and attitudes in chatbot interactions. This field demonstrates theoretical growth, emphasizing the importance of identifying and consolidating these theories for future studies.
5.2 Focus on privacy concerns
Previous research on user privacy concerns in chatbots revealed that chatbot users face a range of privacy concerns. These concerns fall into the specific subthemes that this review identified: decision-making and manipulation, self-disclosure, human-autonomy dynamics, bias and trust, data collection and storage practices, secondary use of data, legal compliance, transparency, consent, and the potential for data breaches and security failures. However, researchers have generally focused on user privacy concerns arising from user self-disclosure (e.g., Dev, 2022; Gieselmann & Sassenberg, 2022; van Eeuwen, 2017) and how it impacts the sharing of users' disclosed information with third parties (e.g., Belen-Saglam et al., 2022; Lee et al., 2022). Some studies have highlighted that self-disclosure often occurs through manipulative design and algorithms that affect user decision-making (e.g., Balaji, 2019; Cheng & Jiang, 2020; de Cosmo et al., 2021; Kim et al., 2022; Lappeman et al., 2023; Pizzi et al., 2023) and that the development of trust exacerbates privacy concerns (e.g., Agnihotri & Bhattacharya, 2023; Rajaobelina et al., 2021; Rese et al., 2020; Tlili et al., 2023). Additionally, other studies emphasize the potential risks of data breaches and security challenges related to the safeguarding of users' personal data.
Nevertheless, a noteworthy gap exists in the literature concerning a specific definition and/or taxonomy of user privacy harms and risks within the context of conversational AI chatbots. By addressing this gap, future research can contribute to a more nuanced understanding of the specific challenges users face, rather than general privacy concerns. Moreover, it has the potential to contribute to the development of focused strategies aimed at improving privacy protection in interactions involving chatbots.
5.3 Legal and compliance considerations
The existing literature shows the necessity for these chatbots to adhere to specific legal regulations. Hendrickx et al. (2021), Belen Saglam et al. (2021), and Belen-Saglam et al. (2022) emphasize that privacy regulations and the personalization feature of conversational chatbots must be carefully accommodated, especially when dealing with sensitive data like medical information, as privacy is of utmost concern in such contexts. Moreover, Ruane et al. (2019) and Dev and Dev (2023) argue that users may unknowingly disclose substantial personal data through natural language interactions with conversational agents, highlighting unique privacy concerns and the importance of complying with legal requirements, including GDPR. In addition, Ruane et al. (2019) emphasize the need for better user understanding and control over their data, how it is processed or stored by the conversational agent, and compliance with relevant laws and regulations. Scrutinizing the implications of data protection laws and regulations on the evolution and deployment of chatbots can offer instructive guidance.
It is worth noting here, however, that there is a lack of research dedicated to exploring specific legal compliance requirements, particularly in recognizing and prioritizing user privacy concerns. Further research is essential to empower users by creating tools and mechanisms that enable them to exercise greater control over their data and privacy when interacting with chatbots. Investigating the effectiveness of privacy settings and user-driven consent mechanisms would also contribute to research on legal and compliance aspects.
5.4 Research alignments with LIS
Interestingly, upon conducting a comprehensive analysis, it was determined that there is only one study (Gamble, 2020) in the field of Library and Information Science (LIS) (see Figure 8). The primary disciplines represented in the corpus are computer science (n = 9) and communication (n = 6). It is important to note that addressing privacy harms and risks in conversational text-based AI chatbots, as well as more broadly investigating user interactions and decision-making, is significant to LIS and SI, as well as to interdisciplinary scholarship. This research area aligns closely with the field's core focus on the intersection of technology and society, given the increasingly prominent roles that chatbots play in shaping social interactions and delivering services. Emphasizing a user-privacy-centric, ethical, and trustworthy approach within these chatbots is imperative within technology-mediated environments. The interdisciplinary nature of SI provides a unique opportunity to integrate insights from a variety of fields, enabling a comprehensive exploration of privacy challenges in AI chatbots within the broader societal context. By addressing privacy concerns, SI researchers can contribute to the user-privacy-centered, ethical, and trust-building dimensions of technology adoption and policy development in conversational text-based AI chatbots. Fostering interdisciplinary collaboration further enhances the potential for a more thorough understanding of the intricate nature of these issues.

5.5 The roadmap for future user studies
In the literature, several methodological limitations and challenges were addressed while exploring user privacy in conversational chatbots which should be considered in future studies.
First, some papers mention limitations related to small sample sizes (e.g., Balaji, 2019; Dev, 2022; Dev & Dev, 2023; Griffin et al., 2021; van Eeuwen, 2017) or specific demographics (e.g., Cheng & Jiang, 2020; Dev, 2022; Griffin et al., 2021), which may limit the generalizability of the findings. The reliance of some studies on specific populations, such as students or participants from a single geographic location, raises concerns regarding the broader applicability of study outcomes (Rese et al., 2020). Moreover, cultural and contextual differences are highlighted as demographic limitations in some papers (Mercieca, 2019; Song et al., 2022). Second, since user studies rely on self-reported data, they might not fully reflect participants' actual behaviors or experiences with chatbots (i.e., Belen-Saglam et al., 2022; Gieselmann & Sassenberg, 2022; Ng et al., 2020). Third, some studies mention that they relied on simulated or vignette-based interactions with chatbots rather than real interactions, which could impact the ecological validity of their findings (i.e., Bouhia et al., 2022; Kim et al., 2022). Rese et al. (2020) also mention that their studies focused on short-term interactions with chatbots, which might not capture the dynamics of long-term relationships or evolving user behavior over time. Fourth, some studies (Lappeman et al., 2023; Rese et al., 2020) mention that their results are based on participants' willingness to disclose information or engage with chatbots, which might not fully reflect real-world privacy concerns or behavior. Fifth, the chatbots used in some studies might not reflect the current state of AI technology (e.g., Lee et al., 2020), for example not accounting for advancements like anthropomorphic cues (Mehta et al., 2022) or voice interactions (Griffin et al., 2021; Rajaobelina et al., 2021). Last, some studies are specific to certain industries or domains, such as retail (e.g., Dev & Dev, 2023; Song et al., 2022; Völkel et al., 2020) or healthcare (e.g., Fan et al., 2021; Griffin et al., 2021), which may not be directly applicable to other contexts.
The nature of user studies can pose a significant challenge to comprehending specific user privacy issues related to conversational chatbots. However, none of the studies explored the real-world, real-time interaction outputs of these chatbots, including actual user data in natural settings, with respect to user privacy harms and risks. Future studies on user privacy in conversational chatbots should prioritize diverse and representative samples, transitioning from simulated to real interactions to enhance ecological validity. It is also essential to capture the evolving dynamics of user relationships with chatbots, and incorporating advancements in AI technology ensures that studies reflect the current landscape. Furthermore, research gaps exist in the realm of effective mitigation strategies, including a comprehensive examination of user-privacy-centric chatbot design processes. To fill this gap, researchers should focus on suggesting practical privacy-awareness techniques, tools, or guidelines. These can be valuable for both developers and users, helping to enhance privacy protection when interacting with chatbots.
6 LIMITATIONS
The goal of this review was to identify and explore the key theories and methods used in research articles and peer-reviewed papers on user privacy harms and risks in conversational text-based AI chatbots, primarily focusing on user interactions. However, this review has several limitations. First, it primarily focuses on user interactions with conversation-oriented, text-based AI chatbots. The author is aware of the existing large literature that focuses on IoT voice agents, chatbot development, and security attacks. However, due to the primary interest in user interactions with these conversational text-based AI chatbots, studies that clearly discuss the chatbot itself rather than user interactions were excluded. Second, in the literature, chatbots have been identified by different names, such as interactive agents, retail agents, virtual agents, smart bots, and digital assistants. This review only used the terms “virtual assistant,” “conversational agent,” “conversational AI,” and “conversational AI chatbots.” Although digital libraries provide an exclusive search, it is possible that some relevant literature was missed when researchers did not use these terms in their papers but still contributed to the “conversational chatbot” and “privacy” research domain. In the future, terms like “intelligent agent,” “virtual agent,” “relational agent,” or “dialogue system” could be considered to create a more comprehensive data collection list. Third, this review collected papers from four different digital repositories and limited the search to papers published in English. Therefore, the review might have missed some papers outside of these digital libraries. Fourth, this review narrowed down papers to those published in journals, main conference proceedings, and relevant dissertation and master's thesis publications, excluding research that appeared in adjunct proceedings of conferences and books. Future work could also include these sources to understand the trends in this area. Fifth, because this review collected papers manually and search results showed duplicates, the author limited the search to the first 200 papers. Future work could consider further systematic exploration of these databases. Last, this work mainly focuses on text-based interactions, but future reviews could consider exploring speech and image interfaces as well. Despite these limitations, this review of the existing body of literature sheds light on immediate privacy concerns linked to conversational AI chatbots.
7 CONCLUSIONS
The emergence of conversational text-based AI chatbots has raised significant user privacy concerns. Despite advances in chatbot design and development, user privacy concerns in conversational text-based AI chatbots remain severely understudied. To understand these concerns, this paper conducted a literature review with a grounded theory approach, analyzing 38 relevant papers in this field. The review primarily analyzed their theoretical and methodological approaches. In addition, it explored the challenges that give rise to privacy and security apprehensions in interactions between humans and chatbots. Delving into how the accumulation of user data over time might impact individual privacy and decision-making constitutes an avenue of research warranting further exploration. While technical angles of privacy concerns have been examined in studies, a gap persists in comprehending how users perceive and grasp these concerns. Addressing the research gaps identified above will not only augment our understanding of the privacy implications of conversational AI chatbots but also pave the way for more effective strategies and solutions to mitigate these concerns and ensure that interactions remain user-centered while upholding privacy. Furthermore, as the legal landscape continues to evolve (e.g., EU's GDPR, EU's AI Act), research is indispensable in exploring the regulatory aspects linked to privacy concerns with these chatbots. Privacy regulations and the personalization feature of chatbots must be carefully accommodated, especially when dealing with sensitive data like medical information, as privacy is of utmost concern in such contexts (Hasal et al., 2021; Hendrickx et al., 2021; Ruane et al., 2019). Further research is imperative to empower users by developing tools and mechanisms that allow them to wield better control over their data and privacy during interactions with chatbots. Addressing privacy harms and risks in conversational text-based AI chatbots, as well as more broadly investigating user interactions and decision-making, is significant for SI and interdisciplinary scholarship. By addressing privacy harms and risks, researchers and policymakers can contribute to the user-privacy-centered, ethical, and trust-building dimensions of technology adoption and policy development in conversational text-based AI chatbots.
ACKNOWLEDGMENTS
I would like to express my gratitude to Professor Howard Rosenbaum, who guided me throughout this paper. Additionally, I appreciate the helpful comments and guidance provided by the reviewers and the Editor in finalizing the paper to make it beneficial for future research.