database security Recently Published Documents


NETWORK DATABASE SECURITY WITH INTELLECTUAL ACCESS SUPERVISION USING OUTLIER DETECTION TECHNIQUES

Comparison of performance ROT13 and Caesar cipher method for registration database of vessels berthed at P.T. Samudera Indonesia.

Database security is a very important aspect of an information system. Information is generally intended only for certain groups. Therefore, it is very important for a company to prevent database leakage so that the information it contains does not fall into the hands of unauthorized people. Cryptography is one solution that can be used in database security: one way to maintain the security of a database is to encrypt it. The methods used here to secure the database are the ROT13 and Caesar cipher methods. Both methods have an advantage in processing speed. For this reason, the author compares the two algorithms in terms of encryption and decryption time.
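The comparison described in this abstract can be sketched as a small timing harness. This is a minimal illustration, not the authors' implementation: the function names, the sample record, the Caesar shift of 3, and the repeat count are all assumptions made for the example.

```python
import time

ALPHABET_SIZE = 26


def caesar(text: str, shift: int) -> str:
    """Shift ASCII letters by `shift` positions; other characters pass through."""
    out = []
    for ch in text:
        if "a" <= ch <= "z":
            base = ord("a")
        elif "A" <= ch <= "Z":
            base = ord("A")
        else:
            out.append(ch)
            continue
        out.append(chr((ord(ch) - base + shift) % ALPHABET_SIZE + base))
    return "".join(out)


def rot13(text: str) -> str:
    """ROT13 is the Caesar cipher with a fixed shift of 13, so it is its own inverse."""
    return caesar(text, 13)


def time_round_trips(encrypt, decrypt, record: str, repeats: int = 10_000) -> float:
    """Return the seconds spent on `repeats` encrypt-then-decrypt round trips."""
    start = time.perf_counter()
    for _ in range(repeats):
        decrypt(encrypt(record))
    return time.perf_counter() - start


# Hypothetical registration record; the field layout is an assumption.
record = "KM Example Vessel;Berth 7;2021-03-14"

rot13_seconds = time_round_trips(rot13, rot13, record)
caesar_seconds = time_round_trips(
    lambda t: caesar(t, 3), lambda t: caesar(t, -3), record
)
print(f"ROT13:  {rot13_seconds:.4f} s")
print(f"Caesar: {caesar_seconds:.4f} s")
```

Because ROT13 is a special case of the Caesar cipher, the two timings should be close in this sketch; any measured difference mostly reflects implementation details and measurement noise.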

A Novel Framework for Efficient Multiple Signature on Certificate with Database Security

Abstract: PKI provides a strong degree of security by distributing key pairs among clients. By constructing a PKI, we combine digital identities with digital signatures, which gives an end-to-end trust model. Basically, PKI is an attempt to simulate the real-world human analysis of identity and reliability in a computerized fashion. However, existing applications are centered on a tight trust model, which makes them inadequate as a general tool for trust evaluation. After years of research, development and deployment, PKI still faces strong technical and organizational challenges, such as attacks against Certificate Authorities (CAs). CAs are the fundamental component of PKIs and play a powerful role in the PKI model. A CA must be diligent, credible and legitimate. An attacker who gains control of a CA can use the CA's certificate to issue bogus certificates and impersonate any site, as happened with DigiNotar, GlobalSign, Comodo and DigiCert Malaysia. In this paper we propose an approach to reduce the damage of a compromised CA or CA key by imposing Multiple Signatures (MS) after verifying and authenticating the user’s information. A single compromised CA is not able to issue a certificate to any domain, as multiple signatures are required. The private key and other sensitive information are stored in the form of an object/blob. Without knowing the structure of the class, no one can access the object and the object output stream. The proposed MS scheme achieves better performance than existing MS schemes and controls fraudulent certificate issuance with more database security. The proposed scheme also avoids MITM attacks against the issuing CA by using parameters such as the identity of the sender, the receiver, a timestamp and the Aadhar number.
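The core idea, that one compromised CA cannot issue a certificate alone, can be illustrated with a threshold check. This is a sketch under stated assumptions and not the paper's scheme: HMAC-SHA256 tags stand in for real public-key signatures (a real PKI would use asymmetric signatures), and the CA names, keys, payload format, and threshold of 2 are hypothetical.

```python
import hashlib
import hmac
from typing import Dict


def sign(ca_key: bytes, payload: bytes) -> bytes:
    """Stand-in for a digital signature: an HMAC-SHA256 tag (assumption)."""
    return hmac.new(ca_key, payload, hashlib.sha256).digest()


def count_valid_signatures(
    payload: bytes, signatures: Dict[str, bytes], ca_keys: Dict[str, bytes]
) -> int:
    """Count signatures that verify; keying by CA name means each CA counts once."""
    valid = 0
    for ca_name, tag in signatures.items():
        key = ca_keys.get(ca_name)
        if key is not None and hmac.compare_digest(tag, sign(key, payload)):
            valid += 1
    return valid


def certificate_accepted(
    payload: bytes,
    signatures: Dict[str, bytes],
    ca_keys: Dict[str, bytes],
    threshold: int,
) -> bool:
    """Accept the certificate only if at least `threshold` distinct CAs signed it."""
    return count_valid_signatures(payload, signatures, ca_keys) >= threshold


# Hypothetical CAs and certificate payload.
ca_keys = {"ca-a": b"key-a", "ca-b": b"key-b", "ca-c": b"key-c"}
payload = b"subject=example.org;timestamp=2024-01-01T00:00:00Z"

# An attacker who controls only ca-a cannot reach the threshold of two.
forged = {"ca-a": sign(ca_keys["ca-a"], payload), "ca-b": sign(b"guess", payload)}
legitimate = {name: sign(ca_keys[name], payload) for name in ("ca-a", "ca-b")}

print(certificate_accepted(payload, forged, ca_keys, threshold=2))
print(certificate_accepted(payload, legitimate, ca_keys, threshold=2))
```

The threshold trades issuance overhead against resilience: a higher value tolerates more compromised CAs but requires more parties to sign each certificate.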

A guiding framework for enhancing database security in state-owned universities in Zimbabwe

Technique for evaluating the security of relational databases based on the enhanced Clements–Hoffman model.

Obtaining convincing evidence of database security, as the basic corporate resource, is extremely important. However, in order to verify conclusions about the degree of security, it must be measured. To solve this challenge, the authors of the paper enhanced the Clements–Hoffman model, determined the integral security metric and, on this basis, developed a technique for evaluating the security of relational databases. The essence of improving the Clements–Hoffman model is to expand it by including a set of object vulnerabilities. Vulnerability is considered as a separate, objectively existing category. This makes it possible to evaluate both the likelihood of an unwanted incident and the database security as a whole more adequately. The technique proposed by the authors for evaluating the main components of the security barriers and the database security as a whole is based on the theory of fuzzy sets and risk. As an integral metric of database security, the reciprocal of the total residual risk is used, the constituent components of which are presented in the form of certain linguistic variables. In accordance with the developed technique, the authors presented the results of a quantitative evaluation of the effectiveness of the protection of databases built on the basis of the schema with the universal basis of relations and designed in accordance with the traditional technology of relational databases.

Hybrid Security Approach for Database Security using Diffusion based cryptography and Diffie-Hellman key exchange Algorithm

Application of network database security technology based on big data technology

Database security in a dynamic IT world

Databases are vulnerable. Public statements by Target, Home Depot, and Anthem following their highly publicized data breaches are uniform and succinct on how the breaches unfolded: unauthorized access to systems that ultimately led to the extraction of sensitive information. A comprehensive strategy to secure a database is about more than data security. Usually, security events are related to the following: illegitimate access to data, damage to confidentiality, harm to the integrity of information, and loss of data availability (Discover). Loss of data privacy, that is, making data accessible to others who have no right of access, is not visible within the database and does not require any detectable change to the database. This paper addresses these events to ensure database security.

A Review of Database Security Concepts, Risks, and Problems

Currently, data is produced at a very rapid pace; databases, however, are collections of well-organized data that can be accessed, maintained, and updated quickly. Database systems are critical to a company because they hold data about sales transactions, product inventories, customer profiles, and marketing activities. A Database Management System is used to carry out data manipulation and maintenance activities. Databases differ because their designs are based on countless rules about what constitutes a secure database. As a result, those seeking to protect a database have difficulty selecting a suitable approach to maintain its security. The main goal of this study is to identify the risks and how we can secure databases, encrypt sensitive data, modify system databases, and update database systems, as well as to evaluate some of the methods for handling these security problems. However, because information plays such an important role in any organization, understanding security risks and preventing them from occurring in any database system requires a high level of knowledge. As a result, this paper explains the information any organization needs and also discusses a new technological tool that plays an essential role in database security.

Database protection model based on security system with full overlap

Security is one of the most important characteristics of the quality of information systems in general and databases, as their main component, in particular. Therefore, the presence of an information protection system, as a complex of software, technical, cryptographic, organizational and other methods, means and measures that ensure the integrity, confidentiality, authenticity and availability of information in conditions of exposure to natural or artificial threats, is an integral feature of almost any modern information system and database. At the same time, in order to be able to verify the conclusions about the degree of security, it must be measured in some way. The paper considers a database security model based on a full overlap security model (a covered security system), which is traditionally considered the basis for a formal description of security systems. Thanks to expanding the Clements-Hoffman model by including a set of vulnerabilities (as a separately objectively existing category necessary to describe a weakness of an asset or control that can be exploited by one or more threats), which makes it possible to assess more adequately the likelihood of an unwanted incident (threat realization) in a two-factor model (in which one of the factors reflects the motivational component of the threat, and the second takes into account the existing vulnerabilities); a defined integral indicator of database security (as a value inverse to the total residual risk, the constituent components of which are represented in the form of the corresponding linguistic variables); the developed technique for assessing the main components of security barriers and the security of the database as a whole, based on the theory of fuzzy sets and risk, it becomes possible to use the developed model to conduct a quantitative assessment of the security of the analyzed database.
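The integral indicator described above, a value inverse to the total residual risk, can be illustrated numerically. This is a simplified sketch under loud assumptions: the abstract works with fuzzy linguistic variables, whereas this example uses crisp numbers; the product form of the two-factor likelihood, the `impact` and `barrier_effectiveness` fields, and all sample values are hypothetical and not taken from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Threat:
    motivation: float             # assumed 0..1: motivational component of the threat
    vulnerability: float          # assumed 0..1: how exploitable the weakness is
    impact: float                 # assumed loss in arbitrary units
    barrier_effectiveness: float  # assumed 0..1: share of risk removed by the barrier


def residual_risk(threat: Threat) -> float:
    """Risk left after the security barrier, using an assumed product form."""
    likelihood = threat.motivation * threat.vulnerability  # two-factor likelihood
    return likelihood * threat.impact * (1.0 - threat.barrier_effectiveness)


def security_metric(threats: List[Threat]) -> float:
    """Integral indicator: the reciprocal of the total residual risk."""
    total = sum(residual_risk(t) for t in threats)
    if total == 0:
        return float("inf")  # no residual risk at all
    return 1.0 / total


# Hypothetical threat list for one database.
threats = [
    Threat(motivation=0.8, vulnerability=0.5, impact=10.0, barrier_effectiveness=0.9),
    Threat(motivation=0.5, vulnerability=0.2, impact=20.0, barrier_effectiveness=0.5),
]
print(f"security metric: {security_metric(threats):.3f}")
```

In this form, strengthening any barrier or removing a vulnerability lowers the total residual risk and therefore raises the metric, which matches the direction of the indicator described in the abstract.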


Database security threats: A survey study


  • Survey Paper
  • Open access
  • Published: 01 July 2020

Cybersecurity data science: an overview from machine learning perspective

  • Iqbal H. Sarker (ORCID: orcid.org/0000-0003-1740-5517)
  • A. S. M. Kayes
  • Shahriar Badsha
  • Hamed Alqahtani
  • Paul Watters
  • Alex Ng

Journal of Big Data, volume 7, Article number: 41 (2020)


In a computing context, cybersecurity is undergoing massive shifts in technology and its operations in recent days, and data science is driving the change. Extracting security incident patterns or insights from cybersecurity data and building a corresponding data-driven model is the key to making a security system automated and intelligent. To understand and analyze the actual phenomena with data, various scientific methods, machine learning techniques, processes, and systems are used, which is commonly known as data science. In this paper, we focus on and briefly discuss cybersecurity data science, where the data is gathered from relevant cybersecurity sources and the analytics complement the latest data-driven patterns to provide more effective security solutions. The concept of cybersecurity data science makes the computing process more actionable and intelligent than traditional approaches in the domain of cybersecurity. We then discuss and summarize a number of associated research issues and future directions. Furthermore, we provide a machine learning based multi-layered framework for the purpose of cybersecurity modeling. Overall, our goal is not only to discuss cybersecurity data science and relevant methods but also to focus on their applicability to data-driven intelligent decision making for protecting systems from cyber-attacks.

Introduction

Due to the increasing dependency on digitalization and the Internet-of-Things (IoT) [ 1 ], various security incidents such as unauthorized access [ 2 ], malware attacks [ 3 ], zero-day attacks [ 4 ], data breaches [ 5 ], denial of service (DoS) [ 2 ], and social engineering or phishing [ 6 ] have grown at an exponential rate in recent years. For instance, in 2010, there were fewer than 50 million unique malware executables known to the security community. By 2012, this number had doubled to around 100 million, and in 2019 there were more than 900 million malicious executables known to the security community; this number is likely to grow, according to the statistics of the AV-TEST institute in Germany [ 7 ]. Cybercrime and attacks can cause devastating financial losses and affect both organizations and individuals. It is estimated that a data breach costs 8.19 million USD in the United States and 3.9 million USD on average [ 8 ], and that the annual cost of cybercrime to the global economy is 400 billion USD [ 9 ]. According to Juniper Research [ 10 ], the number of records breached each year is expected to nearly triple over the next 5 years. Thus, it is essential that organizations adopt and implement a strong cybersecurity approach to mitigate the loss. According to [ 11 ], the national security of a country depends on businesses, government, and individual citizens having access to applications and tools that are highly secure, and on the capability to detect and eliminate such cyber-threats in a timely way. Therefore, effectively identifying various cyber incidents, whether previously seen or unseen, and intelligently protecting the relevant systems from such cyber-attacks is a key issue to be solved urgently.

Figure 1: Popularity trends of data science, machine learning and cybersecurity over time, where the x-axis represents the timestamp information and the y-axis represents the corresponding popularity values

Cybersecurity is a set of technologies and processes designed to protect computers, networks, programs and data from attack, damage, or unauthorized access [ 12 ]. In recent days, cybersecurity is undergoing massive shifts in technology and its operations in the context of computing, and data science (DS) is driving the change, where machine learning (ML), a core part of “Artificial Intelligence” (AI), can play a vital role in discovering insights from data. Machine learning can significantly change the cybersecurity landscape, and data science is leading a new scientific paradigm [ 13 , 14 ]. The popularity of these related technologies is increasing day by day, as shown in Fig. 1, based on data from the last five years collected from Google Trends [ 15 ]. The figure shows timestamp information in terms of a particular date on the x-axis and the corresponding popularity, in the range of 0 (minimum) to 100 (maximum), on the y-axis. As shown in Fig. 1, the popularity values of these areas were less than 30 in 2014, while they exceeded 70 in 2019, i.e., popularity more than doubled. In this paper, we focus on cybersecurity data science (CDS), which is broadly related to these areas in terms of security data processing techniques and intelligent decision making in real-world applications. Overall, CDS is security data-focused, applies machine learning methods to quantify cyber risks, and ultimately seeks to optimize cybersecurity operations. Thus, this paper is intended for those in academia and industry who want to study and develop a data-driven smart cybersecurity model based on machine learning techniques. Therefore, great emphasis is placed on a thorough description of various types of machine learning methods, and their relations and usage in the context of cybersecurity. This paper does not describe all of the different techniques used in cybersecurity in detail; instead, it gives an overview of cybersecurity data science modeling based on artificial intelligence, particularly from a machine learning perspective.

The ultimate goal of cybersecurity data science is data-driven intelligent decision making from security data for smart cybersecurity solutions. CDS represents a partial paradigm shift from traditional well-known security solutions such as firewalls, user authentication and access control, and cryptography systems, which might not be effective for today’s needs in the cyber industry [ 16 , 17 , 18 , 19 ]. The problem is that these are typically handled statically by a few experienced security analysts, with data management done in an ad-hoc manner [ 20 , 21 ]. However, as an increasing number of cybersecurity incidents in the different forms mentioned above continuously appear over time, such conventional solutions have encountered limitations in mitigating such cyber risks. As a result, numerous advanced attacks are created and spread very quickly throughout the Internet. Although several researchers use various data analysis and learning techniques to build cybersecurity models, which are summarized in the “Machine learning tasks in cybersecurity” section, a comprehensive security model based on the effective discovery of security insights and the latest security patterns could be more useful. To address this issue, we need to develop more flexible and efficient security mechanisms that can respond to threats and update security policies to mitigate them intelligently and in a timely manner. To achieve this goal, it is necessary to analyze a massive amount of relevant cybersecurity data generated from various sources, such as network and system sources, and to discover insights or proper security policies in an automated manner with minimal human intervention.

Analyzing cybersecurity data and building the right tools and processes to successfully protect against cybersecurity incidents goes beyond a simple set of functional requirements and knowledge about risks, threats or vulnerabilities. To effectively extract the insights or patterns of security incidents, several machine learning techniques, such as feature engineering, data clustering, classification, and association analysis, or neural network-based deep learning techniques, can be used; these are briefly discussed in the “Machine learning tasks in cybersecurity” section. These learning techniques are capable of finding anomalies or malicious behavior and data-driven patterns of associated security incidents in order to make an intelligent decision. Thus, based on the concept of data-driven decision making, we focus on cybersecurity data science, where the data is gathered from relevant cybersecurity sources such as network activity, database activity, application activity, or user activity, and the analytics complement the latest data-driven patterns to provide corresponding security solutions.

The contributions of this paper are summarized as follows.

We first briefly discuss the concept of cybersecurity data science and relevant methods to understand its applicability to data-driven intelligent decision making in the domain of cybersecurity. For this purpose, we also review and briefly discuss different machine learning tasks in cybersecurity, and summarize various cybersecurity datasets, highlighting their usage in different data-driven cyber applications.

We then discuss and summarize a number of associated research issues and future directions in the area of cybersecurity data science that could help both academia and industry to further research and development in relevant application areas.

Finally, we provide a generic multi-layered framework of the cybersecurity data science model based on machine learning techniques. In this framework, we briefly discuss how the cybersecurity data science model can be used to discover useful insights from security data and to make data-driven intelligent decisions for building smart cybersecurity systems.

The remainder of the paper is organized as follows. The “Background” section summarizes the background of our study and gives an overview of the technologies related to cybersecurity data science. The “Cybersecurity data science” section defines and briefly discusses cybersecurity data science, including various categories of cyber incident data. In the “Machine learning tasks in cybersecurity” section, we briefly discuss various categories of machine learning techniques, including their relations with cybersecurity tasks, and summarize a number of machine learning based cybersecurity models in the field. The “Research issues and future directions” section briefly discusses and highlights various research issues and future directions in the area of cybersecurity data science. In the “A multi-layered framework for smart cybersecurity services” section, we suggest a machine learning-based framework for building a cybersecurity data science model and discuss its layers and their roles. In the “Discussion” section, we highlight several key points regarding our study. Finally, the “Conclusion” section concludes this paper.

Background

In this section, we give an overview of the related technologies of cybersecurity data science including various types of cybersecurity incidents and defense strategies.

Cybersecurity

Over the last half-century, the information and communication technology (ICT) industry has evolved greatly, and it is now ubiquitous and closely integrated with modern society. Thus, protecting ICT systems and applications from cyber-attacks has become a major concern for security policymakers in recent days [ 22 ]. The act of protecting ICT systems from various cyber-threats or attacks has come to be known as cybersecurity [ 9 ]. Several aspects are associated with cybersecurity: measures to protect information and communication technology; the raw data and information it contains and their processing and transmission; associated virtual and physical elements of the systems; the degree of protection resulting from the application of those measures; and eventually the associated field of professional endeavor [ 23 ]. Craigen et al. defined “cybersecurity as a set of tools, practices, and guidelines that can be used to protect computer networks, software programs, and data from attack, damage, or unauthorized access” [ 24 ]. According to Aftergood et al. [ 12 ], “cybersecurity is a set of technologies and processes designed to protect computers, networks, programs and data from attacks and unauthorized access, alteration, or destruction”. Overall, cybersecurity is concerned with understanding diverse cyber-attacks and devising corresponding defense strategies that preserve several properties, defined below [ 25 , 26 ].

Confidentiality is a property used to prevent the access and disclosure of information to unauthorized individuals, entities or systems.

Integrity is a property used to prevent any modification or destruction of information in an unauthorized manner.

Availability is a property used to ensure timely and reliable access of information assets and systems to an authorized entity.

The term cybersecurity applies in a variety of contexts, from business to mobile computing, and can be divided into several common categories: network security, which mainly focuses on securing a computer network from cyber attackers or intruders; application security, which is concerned with keeping software and devices free of risks or cyber-threats; information security, which mainly considers the security and privacy of relevant data; and operational security, which includes the processes of handling and protecting data assets. Typical cybersecurity systems are composed of network security systems and computer security systems containing a firewall, antivirus software, or an intrusion detection system [ 27 ].

Cyberattacks and security risks

The risks typically associated with any attack involve three security factors: threats, i.e., who is attacking; vulnerabilities, i.e., the weaknesses they are attacking; and impacts, i.e., what the attack does [ 9 ]. A security incident is an act that threatens the confidentiality, integrity, or availability of information assets and systems. Several types of cybersecurity incidents may result in security risks to an organization’s systems and networks or to an individual [ 2 ]. These are:

Unauthorized access, which describes the act of accessing a network, systems, or data without authorization, resulting in a violation of a security policy [ 2 ];

Malware, also known as malicious software, is any program or software that is intentionally designed to cause damage to a computer, client, server, or computer network, e.g., botnets. Examples of different types of malware include computer viruses, worms, Trojan horses, adware, ransomware, spyware, malicious bots, etc. [ 3 , 26 ]. Ransom malware, or ransomware, is an emerging form of malware that prevents users from accessing their systems, personal files, or devices, and then demands an anonymous online payment in order to restore access;

Denial-of-Service is an attack meant to shut down a machine or network, making it inaccessible to its intended users by flooding the target with traffic that triggers a crash. The Denial-of-Service (DoS) attack typically uses one computer with an Internet connection, while distributed denial-of-service (DDoS) attack uses multiple computers and Internet connections to flood the targeted resource [ 2 ];

Phishing is a type of social engineering, a term used for a broad range of malicious activities accomplished through human interactions. It is a fraudulent attempt to obtain sensitive information such as banking and credit card details, login credentials, or personally identifiable information by disguising oneself as a trusted individual or entity via an electronic communication such as email, text, or instant message [ 26 ];

Zero-day attack is the term used to describe the threat of an unknown security vulnerability for which either a patch has not been released or of which the application developers were unaware [ 4 , 28 ].

Besides the attacks mentioned above, privilege escalation [ 29 ], password attacks [ 30 ], insider threats [ 31 ], man-in-the-middle attacks [ 32 ], advanced persistent threats [ 33 ], SQL injection attacks [ 34 ], cryptojacking attacks [ 35 ], web application attacks [ 30 ], etc. are well-known security incidents in the field of cybersecurity. A data breach, also known as a data leak, is another type of security incident, which involves unauthorized access to data by an individual, application, or service [ 5 ]. Thus, all data breaches are security incidents; however, not all security incidents are data breaches. Most data breaches occur in the banking industry, involving credit card numbers and personal information, followed by the healthcare sector and the public sector [ 36 ].

Cybersecurity defense strategies

Defense strategies are needed to protect data or information, information systems, and networks from cyber-attacks or intrusions. More granularly, they are responsible for preventing data breaches or security incidents and for monitoring and reacting to intrusions, which can be defined as any kind of unauthorized activity that causes damage to an information system [ 37 ]. An intrusion detection system (IDS) is typically represented as “a device or software application that monitors a computer network or systems for malicious activity or policy violations” [ 38 ]. The traditional well-known security solutions such as anti-virus, firewalls, user authentication, access control, data encryption and cryptography systems, however, might not be effective for today’s needs in the cyber industry [ 16 , 17 , 18 , 19 ]. On the other hand, an IDS resolves these issues by analyzing security data from several key points in a computer network or system [ 39 , 40 ]. Moreover, intrusion detection systems can be used to detect both internal and external attacks.

Intrusion detection systems fall into different categories according to their scope of use. For instance, host-based intrusion detection systems (HIDS) and network intrusion detection systems (NIDS) are the most common types, with scopes ranging from single computers to large networks. A HIDS monitors important files on an individual system, while a NIDS analyzes and monitors network connections for suspicious traffic. Similarly, based on methodology, signature-based IDS and anomaly-based IDS are the most well-known variants [ 37 ].

Signature-based IDS : A signature can be a predefined string, pattern, or rule that corresponds to a known attack. In a signature-based IDS, identifying a particular pattern is treated as detection of the corresponding attack. Examples of signatures are known patterns or byte sequences in network traffic, or sequences used by malware. To detect attacks, anti-virus software uses such sequences or patterns as signatures while performing the matching operation. Signature-based IDS is also known as knowledge-based or misuse detection [ 41 ]. This technique can efficiently process a high volume of network traffic; however, it is strictly limited to known attacks. Thus, detecting new or unseen attacks is one of the biggest challenges faced by signature-based systems.
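The matching operation described above can be sketched as a lookup of known byte sequences in a payload. This is a minimal illustration only: the signature names and patterns are hypothetical examples, and a production system would use far richer rules than plain substring checks.

```python
from typing import Dict, List

# Hypothetical signature database: name -> byte sequence of a known attack.
SIGNATURES: Dict[str, bytes] = {
    "sql-injection-probe": b"' OR '1'='1",
    "path-traversal": b"../../",
    "test-malware-marker": b"MALWARE-TEST-MARKER",
}


def match_signatures(payload: bytes, signatures: Dict[str, bytes]) -> List[str]:
    """Return the names of all signatures whose byte pattern occurs in the payload."""
    return [name for name, pattern in signatures.items() if pattern in payload]


known_attack = b"GET /login?user=admin' OR '1'='1 HTTP/1.1"
novel_attack = b"GET /login?user=admin'/**/UNION/**/SELECT HTTP/1.1"

print(match_signatures(known_attack, SIGNATURES))  # matches the stored probe pattern
print(match_signatures(novel_attack, SIGNATURES))  # no stored pattern matches
```

The second request is also an injection attempt, but no stored pattern covers it, which illustrates the limitation to known attacks noted above.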

Anomaly-based IDS : The concept of anomaly-based detection overcomes the issues of signature-based IDS discussed above. In an anomaly-based intrusion detection system, the behavior of the network is first examined to find dynamic patterns, to automatically create a data-driven model, to profile the normal behavior, and thus it detects deviations in the case of any anomalies [ 41 ]. Thus, anomaly-based IDS can be treated as a dynamic approach, which follows behavior-oriented detection. The main advantage of anomaly-based IDS is the ability to identify unknown or zero-day attacks [ 42 ]. However, the issue is that the identified anomaly or abnormal behavior is not always an indicator of intrusions. It sometimes may happen because of several factors such as policy changes or offering a new service.
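A very small version of this idea is to profile a single traffic feature and flag large deviations from it. The sketch below assumes requests per minute as the feature, a mean and standard deviation as the "normal behavior" profile, and a z-score threshold of 3; all of these are illustrative choices, not part of the source.

```python
from statistics import mean, stdev
from typing import List, Tuple


def build_profile(normal_values: List[float]) -> Tuple[float, float]:
    """Profile normal behavior as the mean and sample standard deviation."""
    return mean(normal_values), stdev(normal_values)


def is_anomalous(value: float, profile: Tuple[float, float], threshold: float = 3.0) -> bool:
    """Flag a value whose z-score against the profile exceeds the threshold."""
    mu, sigma = profile
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold


# Hypothetical baseline: requests per minute observed during normal operation.
baseline = [98, 102, 100, 97, 103, 101, 99, 100]
profile = build_profile(baseline)

print(is_anomalous(101, profile))  # within normal variation
print(is_anomalous(250, profile))  # large deviation, possibly an attack
print(is_anomalous(110, profile))  # also flagged, e.g. after launching a new service
```

The last case shows the drawback mentioned above: a modest, legitimate change in behavior is flagged just like an intrusion, so the threshold and the profiling window need tuning to keep false alarms manageable.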

In addition, a hybrid detection approach [ 43 , 44 ] that takes into account both the misuse and anomaly-based techniques discussed above can be used to detect intrusions. In a hybrid system, the misuse detection system is used for detecting known types of intrusions and the anomaly detection system is used for novel attacks [ 45 ]. Besides these approaches, stateful protocol analysis can also be used to detect intrusions. It identifies deviations of protocol state similarly to the anomaly-based method; however, it uses predetermined universal profiles based on accepted definitions of benign activity [ 41 ]. In Table 1, we have summarized these common approaches, highlighting their pros and cons. Once detection has been completed, an intrusion prevention system (IPS), which is intended to prevent malicious events, can be used to mitigate the risks in different ways, such as a manual response, a notification, or an automatic process [ 46 ]. Among these approaches, an automatic response system could be more effective, as it does not involve a human interface between the detection and response systems.

Data science

We are living in the age of data, advanced analytics, and data science, which are related to data-driven intelligent decision making. Although the process of searching for patterns or discovering hidden and interesting knowledge in data is known as data mining [ 47 ], in this paper we use the broader term “data science” rather than data mining. The reason is that data science, in its most fundamental form, is all about understanding data. It involves studying, processing, and extracting valuable insights from a set of information. In addition to data mining, data analytics is also related to data science. The development of data mining, knowledge discovery, and machine learning, which refers to creating algorithms and programs that learn on their own, together with the original data analysis and descriptive analytics from the statistical perspective, forms the general concept of “data analytics” [ 47 ]. Nowadays, many researchers use the term “data science” to describe the interdisciplinary field of data collection, preprocessing, inferring, or making decisions by analyzing the data. To understand and analyze the actual phenomena with data, various scientific methods, machine learning techniques, processes, and systems are used, which is commonly known as data science. According to Cao et al. [ 47 ], “data science is a new interdisciplinary field that synthesizes and builds on statistics, informatics, computing, communication, management, and sociology to study data and its environments, to transform data to insights and decisions by following a data-to-knowledge-to-wisdom thinking and methodology”. As a high-level statement in the context of cybersecurity, we can conclude that it is the study of security data to provide data-driven solutions for the given security problems, also known as “the science of cybersecurity data”. Figure 2 shows the typical data-to-insight-to-decision transfer at different periods and the general analytic stages in data science, in terms of a variety of analytics goals (G) and approaches (A) to achieve the data-to-decision goal [ 47 ].

Figure 2: Data-to-insight-to-decision analytic stages in data science [ 47 ]

Given the analytic power of data science, including machine learning techniques, it can be a viable component of security strategies. By using data science techniques, security analysts can manipulate and analyze security data more effectively and efficiently, uncovering valuable insights from data. Thus, data science methodologies including machine learning techniques can be well utilized in the context of cybersecurity, in terms of problem understanding, gathering security data from diverse sources, preparing data to feed into the model, and data-driven model building and updating, for providing smart security services. This motivates us to define cybersecurity data science and to work in this research area.

Cybersecurity data science

In this section, we briefly discuss cybersecurity data science including various categories of cyber incidents data with the usage in different application areas, and the key terms and areas related to our study.

Understanding cybersecurity data

Data science is largely driven by the availability of data [ 48 ]. Datasets, on which cybersecurity data science is based, typically represent a collection of information records that consist of several attributes or features and related facts. Thus, it’s important to understand the nature of cybersecurity data containing various types of cyberattacks and relevant features. The reason is that raw security data collected from relevant cyber sources can be used to analyze the various patterns of security incidents or malicious behavior, and to build a data-driven security model to achieve our goal. Several datasets exist in the area of cybersecurity, including intrusion analysis, malware analysis, and anomaly, fraud, or spam analysis, that are used for various purposes. In Table 2 , we summarize several such datasets that are accessible on the Internet, including their various features and attacks, and highlight their usage based on machine learning techniques in different cyber applications. Effectively analyzing and processing these security features, building a target machine learning-based security model according to the requirements, and, eventually, data-driven decision making could play a role in providing intelligent cybersecurity services, which are discussed briefly in the “ A multi-layered framework for smart cybersecurity services ” section.

Defining cybersecurity data science

Data science is transforming the world’s industries. It is critically important for the future of intelligent cybersecurity systems and services because “security is all about data”. When we seek to detect cyber threats, we are analyzing security data in the form of files, logs, network packets, or other relevant sources. Traditionally, security professionals didn’t use data science techniques to make detections based on these data sources. Instead, they used file hashes, custom-written rules like signatures, or manually defined heuristics [ 21 ]. Although these techniques have their own merits in several cases, they need too much manual work to keep up with the changing cyber threat landscape. In contrast, data science can make a massive shift in technology and its operations, where machine learning algorithms can be used to learn or extract insight into security incident patterns from the training data for their detection and prevention. For instance, these techniques can be used to detect malware or suspicious trends, or to extract policy rules.

Recently, the entire security industry has been moving towards data science because of its capability to transform raw data into decision making. To do this, several data-driven tasks can be associated, such as: (i) data engineering, focusing on practical applications of data gathering and analysis; (ii) reducing data volume, which deals with filtering significant and relevant data for further analysis; (iii) discovery and detection, which focuses on extracting insight, incident patterns, or knowledge from data; (iv) automated models, which focus on building data-driven intelligent security models; (v) targeted security alerts, focusing on the generation of remarkable security alerts based on discovered knowledge that minimizes false alerts; and (vi) resource optimization, which deals with the available resources to achieve the target goals in a security system. While making data-driven decisions, behavioral analysis could also play a significant role in the domain of cybersecurity [ 81 ].

Thus, the concept of cybersecurity data science incorporates the methods and techniques of data science and machine learning as well as the behavioral analytics of various security incidents. The combination of these technologies has given birth to the term “cybersecurity data science”, which refers to collecting a large amount of security event data from different sources and analyzing it using machine learning technologies for detecting security risks or attacks, either through the discovery of useful insights or the latest data-driven patterns. It is, however, worth remembering that cybersecurity data science is not just a collection of machine learning algorithms but rather a process that can help security professionals or analysts to scale and automate their security activities in a smart way and in a timely manner. Therefore, the formal definition can be as follows: “Cybersecurity data science is a research or working area existing at the intersection of cybersecurity, data science, and machine learning or artificial intelligence, which is mainly security data-focused, applies machine learning methods, attempts to quantify cyber-risks or incidents, and promotes inferential techniques to analyze behavioral patterns in security data. It also focuses on generating security response alerts, and eventually seeks for optimizing cybersecurity solutions, to build automated and intelligent cybersecurity systems.”

Table  3 highlights some key terms associated with cybersecurity data science. Overall, the outputs of cybersecurity data science are typically security data products, which can be a data-driven security model, policy rule discovery, risk or attack prediction, potential security service and recommendation, or the corresponding security system depending on the given security problem in the domain of cybersecurity. In the next section, we briefly discuss various machine learning tasks with examples within the scope of our study.

Machine learning tasks in cybersecurity

Machine learning (ML) is typically considered a branch of “Artificial Intelligence”, which is closely related to computational statistics, data mining and analytics, and data science, particularly focusing on making computers learn from data [ 82 , 83 ]. Thus, machine learning models typically comprise a set of rules, methods, or complex “transfer functions” that can be applied to find interesting data patterns, or to recognize or predict behavior [ 84 ], which could play an important role in the area of cybersecurity. In the following, we discuss different methods that can be used to solve machine learning tasks and how they are related to cybersecurity tasks.

Supervised learning

Supervised learning is performed when specific targets are defined to reach from a certain set of inputs, i.e., a task-driven approach. In the area of machine learning, the most popular supervised learning techniques are known as classification and regression methods [ 129 ]. These techniques are popular for classifying or predicting the future for a particular security problem. For instance, classification techniques can be used in the cybersecurity domain to predict a denial-of-service attack (yes, no) or to identify different classes of network attacks such as scanning and spoofing. ZeroR [ 83 ], OneR [ 130 ], Naive Bayes [ 131 ], Decision Tree [ 132 , 133 ], K-nearest neighbors [ 134 ], support vector machines [ 135 ], adaptive boosting [ 136 ], and logistic regression [ 137 ] are well-known classification techniques. In addition, Sarker et al. have recently proposed the BehavDT [ 133 ] and IntruDtree [ 106 ] classification techniques, which are able to effectively build a data-driven predictive model. On the other hand, regression techniques are useful for predicting a continuous or numeric value, e.g., the total number of phishing attacks in a certain period or network packet parameters. Regression analyses can also be used to detect the root causes of cybercrime and other types of fraud [ 138 ]. Linear regression [ 82 ] and support vector regression [ 135 ] are popular regression techniques. The main difference between classification and regression is that the output variable in regression is numerical or continuous, while the predicted output for classification is categorical or discrete. Ensemble learning is an extension of supervised learning that mixes different simple models, e.g., Random Forest learning [ 139 ], which generates multiple decision trees to solve a particular security task.
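To make the classification setting concrete, the following minimal sketch implements a K-nearest neighbors classifier in plain Python. The two flow features, the toy training records, and the labels "normal" and "dos" are made up for illustration and do not come from any dataset discussed in this paper.

```python
from collections import Counter
from math import dist


def knn_predict(train_X, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest neighbors."""
    ranked = sorted(zip(train_X, train_y), key=lambda pair: dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Made-up flow features: (packets per second, mean packet size in bytes).
train_X = [(5, 500), (8, 450), (6, 520), (900, 60), (1200, 64), (1000, 70)]
train_y = ["normal", "normal", "normal", "dos", "dos", "dos"]

print(knn_predict(train_X, train_y, (950, 66)))  # a flow resembling the "dos" records
```

In practice the features would usually be scaled before computing distances, since Euclidean distance is dominated by the attribute with the largest numeric range.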

Unsupervised learning

In unsupervised learning problems, the main task is to find patterns, structures, or knowledge in unlabeled data, i.e., a data-driven approach [ 140 ]. In the area of cybersecurity, cyber-attacks like malware stay hidden in several ways, including changing their behavior dynamically and autonomously to avoid detection. Clustering techniques, a type of unsupervised learning, can help to uncover the hidden patterns and structures in datasets and to identify indicators of such sophisticated attacks. Similarly, clustering techniques can be useful in identifying anomalies and policy violations, and in detecting and eliminating noisy instances in data. K-means [ 141 ] and K-medoids [ 142 ] are popular partitioning clustering algorithms, and single linkage [ 143 ] and complete linkage [ 144 ] are well-known hierarchical clustering algorithms used in various application domains. Moreover, a bottom-up clustering approach proposed by Sarker et al. [ 145 ] can also be used by taking into account the data characteristics.
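As an illustration of partitioning clustering, the sketch below implements the standard K-means iteration (Lloyd's algorithm) in plain Python. The initial centroids are passed in explicitly so that the run is deterministic; the two-dimensional points are made-up values, not real security data.

```python
from math import dist


def kmeans(points, centroids, max_iter=100):
    """Alternate assignment and centroid-update steps until assignments are stable."""
    centroids = [tuple(c) for c in centroids]
    assignment = []
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for each point.
        new_assignment = [
            min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
            for p in points
        ]
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return assignment, centroids


# Made-up behavioral measurements forming two visible groups.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 9.0), (8.5, 9.5)]
assignment, centroids = kmeans(points, centroids=[points[0], points[3]])
```

A point that lies far from every resulting centroid could then be treated as a candidate anomaly, although the distance cut-off would have to be chosen per application.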

Besides, feature engineering tasks like optimal feature selection or extraction related to a particular security problem could be useful for further analysis [ 106 ]. Recently, Sarker et al. [ 106 ] have proposed an approach for selecting security features according to their importance score values. Moreover, principal component analysis, linear discriminant analysis, Pearson correlation analysis, and non-negative matrix factorization are popular dimensionality reduction techniques to solve such issues [ 82 ]. Association rule learning is another example, where machine learning-based policy rules can prevent cyber-attacks. In an expert system, the rules are usually manually defined by a knowledge engineer working in collaboration with a domain expert [ 37 , 140 , 146 ]. Association rule learning, on the contrary, is the discovery of rules or relationships among a set of available security features or attributes in a given dataset [ 147 ]. To quantify the strength of relationships, correlation analysis can be used [ 138 ]. Many association rule mining algorithms have been proposed in the machine learning and data mining literature, such as logic-based [ 148 ], frequent pattern-based [ 149 , 150 , 151 ], and tree-based [ 152 ] algorithms. Recently, Sarker et al. [ 153 ] have proposed an association rule learning approach considering non-redundant generation, which can be used to discover a set of useful security policy rules. Moreover, AIS [ 147 ], Apriori [ 149 ], Apriori-TID and Apriori-Hybrid [ 149 ], FP-Tree [ 152 ], RARM [ 154 ], and Eclat [ 155 ] are well-known association rule learning algorithms that are capable of solving such problems by generating a set of policy rules in the domain of cybersecurity.
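The notions of support and confidence behind association rule learning can be sketched as follows. This is a brute-force enumeration of single-item rules, not an implementation of Apriori or any other algorithm cited above; the event records and the thresholds are hypothetical.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def mine_rules(transactions, min_support=0.3, min_confidence=0.8):
    """Return (antecedent, consequent, support, confidence) for one-item rules."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        pair_support = support({a, b}, transactions)
        if pair_support < min_support:
            continue  # infrequent pairs cannot yield useful rules
        for lhs, rhs in ((a, b), (b, a)):
            confidence = pair_support / support({lhs}, transactions)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, pair_support, confidence))
    return rules


# Made-up security events, each encoded as a set of attribute=value items.
events = [
    frozenset({"port=23", "proto=tcp", "label=attack"}),
    frozenset({"port=23", "proto=tcp", "label=attack"}),
    frozenset({"port=443", "proto=tcp", "label=benign"}),
    frozenset({"port=23", "proto=tcp", "label=attack"}),
    frozenset({"port=443", "proto=tcp", "label=benign"}),
]

for lhs, rhs, sup, conf in mine_rules(events):
    print(f"IF {lhs} THEN {rhs} (support={sup:.2f}, confidence={conf:.2f})")
```

Even on this tiny example several of the emitted rules restate each other in both directions, which hints at the redundancy problem that motivates non-redundant rule generation.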

Neural networks and deep learning

Deep learning is a part of machine learning in the area of artificial intelligence, which is a computational model that is inspired by the biological neural networks in the human brain [ 82 ]. The Artificial Neural Network (ANN) is frequently used in deep learning, and the most popular neural network algorithm is backpropagation [ 82 ]. It performs learning on a multi-layer feed-forward neural network consisting of an input layer, one or more hidden layers, and an output layer. The main difference between deep learning and classical machine learning is how performance changes as the amount of security data increases. Typically, deep learning algorithms perform well when the data volumes are large, whereas classical machine learning algorithms perform comparatively better on small datasets [ 44 ]. In our earlier work, Sarker et al. [ 129 ], we have illustrated the effectiveness of these approaches considering contextual datasets. Deep learning approaches mimic the human brain mechanism to interpret large amounts of data or complex data such as images, sounds, and texts [ 44 , 129 ]. In terms of feature extraction to build models, deep learning reduces the effort of designing a feature extractor for each problem compared with classical machine learning techniques. Besides these characteristics, deep learning typically takes longer to train than a classical machine learning algorithm; however, the test time is exactly the opposite [ 44 ]. Thus, deep learning relies more on high-performance machines with GPUs than classical machine learning algorithms do [ 44 , 156 ]. The most popular deep neural network learning models include the multi-layer perceptron (MLP) [ 157 ], convolutional neural network (CNN) [ 158 ], and recurrent neural network (RNN) or long short-term memory (LSTM) network [ 121 , 158 ]. Recently, researchers have used these deep learning techniques for different purposes in the domain of cybersecurity, such as detecting network intrusions and malware traffic detection and classification [ 44 , 159 ].
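For readers unfamiliar with backpropagation, the following sketch computes the gradients for a feed-forward network with one hidden layer in plain Python. The sigmoid activations, the squared-error loss, and the learning rate are assumptions chosen to keep the example short; practical systems would rely on a deep learning library rather than hand-written gradients.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, W1, b1, W2, b2):
    """Input layer -> hidden layer (sigmoid) -> single sigmoid output unit."""
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(W1, b1)]
    y = sigmoid(sum(w * hi for w, hi in zip(W2, h)) + b2)
    return h, y


def backprop(x, target, W1, b1, W2, b2):
    """Gradients of the loss 0.5 * (y - target)**2 for one training example."""
    h, y = forward(x, W1, b1, W2, b2)
    delta_out = (y - target) * y * (1.0 - y)  # error signal at the output unit
    grad_W2 = [delta_out * hi for hi in h]
    grad_b2 = delta_out
    # Propagate the error signal back through the hidden layer.
    delta_hidden = [delta_out * w * hi * (1.0 - hi) for w, hi in zip(W2, h)]
    grad_W1 = [[d * xi for xi in x] for d in delta_hidden]
    grad_b1 = delta_hidden
    return grad_W1, grad_b1, grad_W2, grad_b2


def sgd_step(x, target, W1, b1, W2, b2, lr=0.5):
    """Apply one gradient-descent update and return the new parameters."""
    gW1, gb1, gW2, gb2 = backprop(x, target, W1, b1, W2, b2)
    W1 = [[w - lr * g for w, g in zip(row, grow)] for row, grow in zip(W1, gW1)]
    b1 = [b - lr * g for b, g in zip(b1, gb1)]
    W2 = [w - lr * g for w, g in zip(W2, gW2)]
    return W1, b1, W2, b2 - lr * gb2
```

Repeating `sgd_step` over many labeled examples is what "training" amounts to here; the cost of doing this for large networks is the reason for the reliance on GPUs noted above.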

Other learning techniques

Semi-supervised learning can be described as a hybridization of the supervised and unsupervised techniques discussed above, as it works on both labeled and unlabeled data. In the area of cybersecurity, it could be useful when data must be labeled automatically without human intervention, to improve the performance of cybersecurity models. Reinforcement learning is another type of machine learning in which an agent creates its own learning experiences by interacting directly with the environment, i.e., an environment-driven approach, where the environment is typically formulated as a Markov decision process and the agent takes decisions based on a reward function [ 160 ]. Monte Carlo learning, Q-learning, and Deep Q Networks are the most common reinforcement learning algorithms [ 161 ]. For instance, in a recent work [ 126 ], the authors present an approach for detecting botnet traffic or malicious cyber activities using reinforcement learning combined with a neural network classifier. In another work [ 128 ], the authors discuss the application of deep reinforcement learning to intrusion detection for supervised problems, where they obtained the best results with the Deep Q-Network algorithm. In the context of cybersecurity, genetic algorithms, which use fitness, selection, crossover, and mutation to find optimal solutions, could also be used to solve a similar class of learning problems [ 119 ].
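The reinforcement learning setting can be sketched with tabular Q-learning on a very small Markov decision process. The states, actions, transitions, and rewards below are entirely hypothetical and deterministic, chosen only to show the update rule; they are not taken from the works cited above.

```python
import random

STATES = ["normal", "suspicious", "under_attack"]
ACTIONS = ["monitor", "block"]

# Hypothetical deterministic environment: (state, action) -> (next_state, reward).
TRANSITIONS = {
    ("normal", "monitor"): ("normal", 1.0),
    ("normal", "block"): ("normal", -1.0),
    ("suspicious", "monitor"): ("under_attack", 0.0),
    ("suspicious", "block"): ("normal", -1.0),
    ("under_attack", "monitor"): ("under_attack", -10.0),
    ("under_attack", "block"): ("normal", -2.0),
}


def q_learning(episodes=500, steps=10, alpha=0.5, gamma=0.9, seed=0):
    """Learn action values from experience gathered with random exploration."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    for _ in range(episodes):
        state = rng.choice(STATES)
        for _ in range(steps):
            action = rng.choice(ACTIONS)  # purely exploratory behavior policy
            next_state, reward = TRANSITIONS[(state, action)]
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            # Q-learning update: move the estimate toward reward + discounted best value.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
    return q


def greedy_policy(q):
    """Pick, for each state, the action with the highest learned value."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in STATES}
```

Because Q-learning is off-policy, the agent can explore with random actions and still learn the values of the greedy policy; under these made-up rewards that policy monitors in the normal state and blocks otherwise.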

Various types of machine learning techniques discussed above can be useful in the domain of cybersecurity, to build an effective security model. In Table  4 , we have summarized several machine learning techniques that are used to build various types of security models for various purposes. Although these models typically represent a learning-based security model, in this paper, we aim to focus on a comprehensive cybersecurity data science model and relevant issues, in order to build a data-driven intelligent security system. In the next section, we highlight several research issues and potential solutions in the area of cybersecurity data science.

Research issues and future directions

Our study opens several research issues and challenges in the area of cybersecurity data science to extract insight from relevant data towards data-driven intelligent decision making for cybersecurity solutions. In the following, we summarize these challenges ranging from data collection to decision making.

Cybersecurity datasets : Source datasets are the primary component for working in the area of cybersecurity data science. Most of the existing datasets are old and might be insufficient for understanding the recent behavioral patterns of various cyber-attacks. Although the data can be transformed into a meaningful level of understanding after performing several processing tasks, there is still a lack of understanding of the characteristics of recent attacks and their patterns of occurrence. Thus, further processing or machine learning algorithms may provide a low accuracy rate for making the target decisions. Therefore, establishing a large number of recent datasets for a particular problem domain like cyber risk prediction or intrusion detection is needed, which could be one of the major challenges in cybersecurity data science.

Handling quality problems in cybersecurity datasets : Cyber datasets might be noisy, incomplete, insignificant, or imbalanced, or may contain inconsistent instances related to a particular security incident. Such problems in a dataset may affect the quality of the learning process and degrade the performance of machine learning-based models [ 162 ]. To make a data-driven intelligent decision for cybersecurity solutions, such problems in data need to be dealt with effectively before building the cyber models. Therefore, understanding such problems in cyber data and effectively handling them using existing or newly proposed algorithms for a particular problem domain like malware analysis or intrusion detection and prevention is needed, which could be another research issue in cybersecurity data science.

Security policy rule generation : Security policy rules reference security zones and enable a user to allow, restrict, and track traffic on the network based on the corresponding user or user group, and the service or application. The policy rules, including the general and more specific rules, are compared against the incoming traffic in sequence during execution, and the rule that matches the traffic is applied. The policy rules used in most cybersecurity systems are static and generated by human expertise or are ontology-based [ 163 , 164 ]. Although association rule learning techniques produce rules from data, there is a problem of redundancy generation [ 153 ] that makes the policy rule-set complex. Therefore, understanding such problems in policy rule generation and effectively handling them using existing or newly proposed algorithms for a particular problem domain like access control [ 165 ] is needed, which could be another research issue in cybersecurity data science.
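The sequential matching of policy rules described above can be sketched as a first-match evaluation. The rule fields (`user_group`, `application`, `dst_zone`), the example policy, and the default action of "deny" are hypothetical and only serve to illustrate the evaluation order.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    action: str                        # "allow" or "deny"
    user_group: Optional[str] = None   # None acts as a wildcard
    application: Optional[str] = None
    dst_zone: Optional[str] = None

    def matches(self, traffic: dict) -> bool:
        """A rule matches when every non-wildcard field equals the traffic's value."""
        fields = ("user_group", "application", "dst_zone")
        return all(getattr(self, f) in (None, traffic.get(f)) for f in fields)


def evaluate(rules: list, traffic: dict, default: str = "deny") -> str:
    """Return the action of the first rule that matches, in list order."""
    for rule in rules:
        if rule.matches(traffic):
            return rule.action
    return default


# Hypothetical static policy, ordered from most to least restrictive.
POLICY = [
    Rule("deny", application="telnet"),
    Rule("allow", user_group="admins", dst_zone="dmz"),
    Rule("allow", application="https"),
]
```

Because the first matching rule wins, a telnet session from an administrator is denied here even though a later rule would allow it, which shows why redundant or badly ordered rules make a rule-set hard to reason about.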

Hybrid learning method : Most commercial products in the cybersecurity domain contain signature-based intrusion detection techniques [ 41 ]. However, missing features or insufficient profiling can cause these techniques to miss unknown attacks. In that case, anomaly-based detection techniques or a hybrid technique combining signature-based and anomaly-based detection can be used to overcome such issues. A hybrid technique combining multiple learning techniques, or a combination of deep learning and machine learning methods, can be used to extract the target insight for a particular problem domain like intrusion detection, malware analysis, or access control, and to make intelligent decisions for the corresponding cybersecurity solutions.

Protecting the valuable security information : Another issue arising from a cyber data attack is the loss of extremely valuable data and information, which could be damaging for an organization. With the use of encryption or highly complex signatures, one can stop others from probing into a dataset. In such cases, cybersecurity data science can be used to build a data-driven impenetrable protocol to protect such security information. To achieve this goal, cyber analysts can develop algorithms by analyzing the history of cyberattacks to detect the most frequently targeted chunks of data. Thus, understanding such data protection problems and designing corresponding algorithms to effectively handle them could be another research issue in the area of cybersecurity data science.

Context-awareness in cybersecurity : Existing cybersecurity work mainly originates from the relevant cyber data containing several low-level features. When data mining and machine learning techniques are applied to such datasets, a related pattern can be identified that describes it properly. However, broader contextual information [ 140 , 145 , 166 ], such as temporal and spatial context, relationships among events or connections, and dependencies, can be used to decide whether a suspicious activity exists or not. For instance, some approaches may consider individual connections as DoS attacks, while security experts might not treat them as malicious by themselves. Thus, a significant limitation of existing cybersecurity work is the lack of use of contextual information for predicting risks or attacks. Therefore, context-aware adaptive cybersecurity solutions could be another research issue in cybersecurity data science.

Feature engineering in cybersecurity : The efficiency and effectiveness of a machine learning-based security model have always been a major challenge due to the high volume of network data with a large number of traffic features. The high dimensionality of data has been addressed using several techniques such as principal component analysis (PCA) [ 167 ] and singular value decomposition (SVD) [ 168 ]. In addition to low-level features in the datasets, the contextual relationships between suspicious activities might be relevant. Such contextual data can be stored in an ontology or taxonomy for further processing. Thus, how to effectively select the optimal features or extract the significant features, considering both the low-level features and the contextual features, for effective cybersecurity solutions could be another research issue in cybersecurity data science.

Remarkable security alert generation and prioritizing : In many cases, the cybersecurity system may not be well defined and may cause a substantial number of false alarms that are unexpected in an intelligent system. For instance, an IDS deployed in a real-world network generates around nine million alerts per day [ 169 ]. A network-based intrusion detection system typically inspects the incoming traffic, matching the associated patterns to detect risks, threats, or vulnerabilities and generate security alerts. However, responding to each such alert might not be effective, as it consumes relatively large amounts of time and resources and consequently may result in a self-inflicted DoS. To overcome this problem, high-level management is required that correlates the security alerts, considering the current context and their logical relationships, and prioritizes them before reporting them to users, which could be another research issue in cybersecurity data science.
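One simple way to picture alert correlation and prioritization is sketched below: raw alerts that share a source and a signature are merged into one incident, and incidents are ranked by a score. The severity weights, the scoring formula (weight times count), and the sample alerts are hypothetical; real systems would also use context such as asset value or time windows.

```python
from collections import defaultdict

# Hypothetical severity weights per alert signature.
SEVERITY = {"port-scan": 1, "brute-force": 3, "sql-injection": 5}


def correlate(alerts):
    """Group raw alerts that share a source and signature into one incident."""
    incidents = defaultdict(int)
    for alert in alerts:
        incidents[(alert["src"], alert["signature"])] += 1
    return incidents


def prioritize(alerts, top_n=3):
    """Rank correlated incidents by severity weight times alert count."""
    incidents = correlate(alerts)
    scored = [
        (SEVERITY.get(signature, 1) * count, src, signature, count)
        for (src, signature), count in incidents.items()
    ]
    return sorted(scored, reverse=True)[:top_n]


# Made-up raw alerts: 55 alerts that collapse into 3 incidents.
alerts = (
    [{"src": "10.0.0.5", "signature": "port-scan"}] * 40
    + [{"src": "10.0.0.9", "signature": "sql-injection"}] * 10
    + [{"src": "10.0.0.7", "signature": "brute-force"}] * 5
)

for score, src, signature, count in prioritize(alerts):
    print(score, src, signature, count)
```

Here the analyst sees three ranked incidents instead of 55 separate alerts, with the less frequent but more severe SQL injection activity listed first.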

Recency analysis in cybersecurity solutions : Machine learning-based security models typically use a large amount of static data to generate data-driven decisions. Anomaly detection systems rely on constructing such a model considering normal behavior and anomalies, according to their patterns. However, normal behavior in a large and dynamic security system is not well defined and may change over time, which can be considered as an incrementally growing dataset. The patterns in incremental datasets might change in several cases. This often results in a substantial number of false alarms, known as false positives. Thus, a recent malicious behavioral pattern is more likely to be interesting and significant than older ones for predicting unknown attacks. Therefore, effectively using the concept of recency analysis [ 170 ] in cybersecurity solutions could be another issue in cybersecurity data science.
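The effect of favoring recent behavior can be illustrated with an exponentially decayed baseline. This is only one simple way to weight recent observations more heavily and is not the recency analysis method of [ 170 ]; the decay factor, the tolerance, and the hourly counts are made-up values.

```python
def recency_weighted_mean(values, decay=0.5):
    """Weighted mean in which the newest value has weight 1 and each older
    value is discounted by a further factor of `decay` (values are oldest first)."""
    weights = [decay ** age for age in range(len(values) - 1, -1, -1)]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def is_anomalous(value, baseline, tolerance=0.3):
    """Flag a value whose relative deviation from the baseline exceeds `tolerance`."""
    return abs(value - baseline) / baseline > tolerance


# Made-up hourly login counts: normal behavior shifted from about 10 to about 40.
history = [10, 10, 10, 10, 40, 40, 40, 40]

static_baseline = sum(history) / len(history)    # treats all history equally
recent_baseline = recency_weighted_mean(history)  # dominated by the latest hours

print(is_anomalous(42, static_baseline))  # flagged against the static baseline
print(is_anomalous(42, recent_baseline))  # accepted against the recent baseline
```

Under these assumptions the static baseline raises a false alarm for a count of 42, while the recency-weighted baseline has already adapted to the new normal level.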

The most important work for an intelligent cybersecurity system is to develop an effective framework that supports data-driven decision making. In such a framework, we need to consider advanced data analysis based on machine learning techniques, so that the framework is capable of minimizing these issues and providing automated and intelligent security services. Thus, a well-designed security framework for cybersecurity data and its experimental evaluation is a very important direction and a big challenge as well. In the next section, we suggest and discuss a data-driven cybersecurity framework based on machine learning techniques considering multiple processing layers.

A multi-layered framework for smart cybersecurity services

As discussed earlier, cybersecurity data science is data-focused, applies machine learning methods, attempts to quantify cyber risks, promotes inferential techniques to analyze behavioral patterns, focuses on generating security response alerts, and eventually seeks to optimize cybersecurity operations. Hence, we briefly discuss a multi-layered data processing framework that can potentially be used to discover security insights from the raw data to build smart cybersecurity systems, e.g., dynamic policy rule-based access control or intrusion detection and prevention systems. To make a data-driven intelligent decision in the resultant cybersecurity system, an understanding of the security problems and the nature of the corresponding security data, along with their extensive analysis, is needed. For this purpose, our suggested framework not only considers machine learning techniques to build the security model but also takes into account incremental learning and dynamism to keep the model up-to-date, as well as corresponding response generation, which could be more effective and intelligent for providing the expected services. Figure 3 shows an overview of the framework, involving several processing layers, from raw security event data to services. In the following, we briefly discuss the working procedure of the framework.

Figure 3: A generic multi-layered framework based on machine learning techniques for smart cybersecurity services

Security data collecting

Collecting valuable cybersecurity data is a crucial step, which forms a connecting link between security problems in cyberinfrastructure and the corresponding data-driven solution steps in this framework, shown in Fig. 3 . The reason is that cyber data can serve as the source for setting up the ground truth of the security model, which affects the model performance. The quality and quantity of cyber data decide the feasibility and effectiveness of solving the security problem according to our goal. Thus, the concern is how to collect valuable data that meets the unique needs of building data-driven security models.

The general step of collecting and managing security data from diverse data sources is based on a particular security problem and project within the enterprise. Data sources can be classified into several broad categories such as network, host, and hybrid [ 171 ]. Within the network infrastructure, the security system can leverage different types of security data such as IDS logs, firewall logs, network traffic data, packet data, and honeypot data for providing the target security services. For instance, whether a given IP is malicious or not could be detected by analyzing data on IP addresses and their cyber activities. In the domain of cybersecurity, the network source mentioned above is considered the primary security event source to analyze. In the host category, data is collected from an organization’s host machines, where the data sources can be operating system logs, database access logs, web server logs, email logs, application logs, etc. Collecting data from both the network and host machines is considered a hybrid category. Overall, in a data collection layer, the network activity, database activity, application activity, and user activity can be the possible security event sources in the context of cybersecurity data science.

Security data preparing

After collecting the raw security data from various sources according to the problem domain discussed above, this layer is responsible for preparing the raw data for building the model by applying various necessary processes. However, not all of the collected data contributes to the model building process in the domain of cybersecurity [ 172 ]. Therefore, the useless data should be removed from the rest of the data captured by the network sniffer. Moreover, data might be noisy, have missing or corrupted values, or have attributes of widely varying types and scales. High-quality data is necessary for achieving higher accuracy in a data-driven model, which is built by learning a function that maps an input to an output based on example input-output pairs. Thus, a procedure for data cleaning and for handling missing or corrupted values might be required. Moreover, security data features or attributes can be of different types, such as continuous, discrete, or symbolic [ 106 ]. Beyond a solid understanding of these types of data and attributes and their permissible operations, the data and attributes need to be preprocessed to convert them into the target type. Besides, the raw data can be of different types, such as structured, semi-structured, or unstructured. Thus, normalization, transformation, or collation can be useful to organize the data in a structured manner. In some cases, natural language processing techniques might be useful depending on the data type and characteristics, e.g., textual contents. As both the quality and quantity of data decide the feasibility of solving the security problem, effective pre-processing and management of data and their representation can play a significant role in building an effective security model for intelligent services.
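Three of the preparation steps mentioned above (handling missing values, normalization, and converting symbolic attributes) can be sketched in plain Python. Mean imputation, min-max scaling, and one-hot encoding are common choices assumed here for illustration; the paper does not prescribe specific methods.

```python
from statistics import mean


def impute_missing(column):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in column]


def min_max_scale(column):
    """Rescale numeric values linearly to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]  # a constant column has no spread to rescale
    return [(v - lo) / (hi - lo) for v in column]


def one_hot(column):
    """Encode a symbolic attribute as one binary indicator per category."""
    categories = sorted(set(column))
    return [[int(v == c) for c in categories] for v in column], categories


# Made-up raw records: a numeric attribute with a gap and a symbolic attribute.
durations = impute_missing([2, None, 4])
scaled = min_max_scale([0, 5, 10])
encoded, categories = one_hot(["tcp", "udp", "tcp"])
```

After these steps every attribute is numeric and on a comparable scale, which is what most of the learning algorithms discussed earlier expect as input.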

Machine learning-based security modeling

This is the core step where insights and knowledge are extracted from data through the application of cybersecurity data science. In this section, we particularly focus on machine learning-based modeling as machine learning techniques can significantly change the cybersecurity landscape. The security features or attributes and their patterns in data are of high interest to be discovered and analyzed to extract security insights. To achieve the goal, a deeper understanding of data and machine learning-based analytical models utilizing a large amount of cybersecurity data can be effective. Thus, various machine learning tasks can be involved in this model building layer according to the solution perspective. The first is security feature engineering, which is mainly responsible for transforming raw security data into informative features that effectively represent the underlying security problem to the data-driven models. Thus, several data-processing tasks may be involved in this module according to the security data characteristics, such as feature transformation and normalization, feature selection by taking into account a subset of available security features according to their correlations or importance in modeling, or feature generation and extraction by creating brand new principal components. For instance, the chi-squared test, analysis of variance test, correlation coefficient analysis, feature importance, as well as discriminant and principal component analysis, or singular value decomposition, etc. can be used for analyzing the significance of the security features to perform the security feature engineering tasks [ 82 ].
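As a small example of correlation-based feature selection, the sketch below ranks features by the absolute Pearson correlation between each feature and a binary attack label. The feature names and values are made up, and correlation with the label is only one of the significance measures listed above.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / sqrt(var_x * var_y)


def rank_features(features, labels):
    """Sort feature names by absolute correlation with the label, strongest first."""
    scores = {name: abs(pearson(col, labels)) for name, col in features.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up feature columns for six sessions; label 1 marks an attack.
labels = [0, 0, 0, 1, 1, 1]
features = {
    "failed_logins": [0, 1, 0, 8, 9, 7],
    "packet_ttl": [64, 64, 64, 64, 64, 64],
    "session_hour": [9, 14, 20, 10, 15, 21],
}

for name, score in rank_features(features, labels):
    print(f"{name}: {score:.2f}")
```

A subset of the top-ranked features could then be passed to the model, while a constant attribute such as `packet_ttl` in this toy data would be dropped.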

Another significant module is security data clustering, which uncovers hidden patterns and structures in huge volumes of security data to identify where new threats exist. It typically involves grouping security data with similar characteristics, which can be used to solve several cybersecurity problems such as detecting anomalies, policy violations, etc. The malicious behavior or anomaly detection module is typically responsible for identifying a deviation from known behavior, where clustering-based analysis and techniques can also be used. In the cybersecurity area, attack classification or prediction is treated as one of the most significant modules; it is responsible for building a prediction model to classify attacks or threats and to predict future events for a particular security problem. Predicting a denial-of-service attack, or a spam filter separating spam from other messages, are relevant examples. The association learning or policy rule generation module can play a role in building an expert security system that comprises several IF-THEN rules that define attacks. Thus, for policy rule generation in a rule-based access control system, association learning can be used, as it discovers the associations or relationships among a set of available security features in a given security dataset. The popular machine learning algorithms in these categories are briefly discussed in the “Machine learning tasks in cybersecurity” section. The model selection or customization module is responsible for choosing whether to use an existing machine learning model or to customize one. Analyzing data and building models based on traditional machine learning or deep learning methods could achieve acceptable results in certain cases in the domain of cybersecurity. However, in terms of effectiveness and efficiency, or other performance measurements such as time complexity, generalization capacity, and most importantly the impact of the algorithm on the detection rate of a system, machine learning models need to be customized for a specific security problem. Moreover, customizing the related techniques and data could improve the performance of the resultant security model and make it more applicable in a cybersecurity domain. The modules discussed above can work separately or in combination, depending on the target security problems.
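To make the policy rule generation idea concrete, the following sketch derives IF-THEN rules for a rule-based access control setting by measuring the support and confidence of attribute conditions against a decision attribute. It is a brute-force enumeration written for clarity rather than an implementation of Apriori, FP-growth, or any algorithm from the cited literature; the function name `mine_rules`, the thresholds, and the access log are hypothetical.

```python
from itertools import combinations

def mine_rules(records, target, min_support=0.3, min_confidence=0.8, max_len=2):
    """Return (antecedent, outcome, support, confidence) rules predicting `target`."""
    n = len(records)
    # Candidate conditions are (attribute, value) pairs other than the target.
    items = sorted({(k, v) for r in records for k, v in r.items() if k != target})
    rules = []
    for size in range(1, max_len + 1):
        for antecedent in combinations(items, size):
            # Skip antecedents that put two conditions on the same attribute.
            if len({k for k, _ in antecedent}) < size:
                continue
            matched = [r for r in records if all(r.get(k) == v for k, v in antecedent)]
            if not matched:
                continue
            for outcome in sorted({r[target] for r in matched}):
                hits = sum(1 for r in matched if r[target] == outcome)
                support = hits / n                 # share of all records covered by the rule
                confidence = hits / len(matched)   # share of matching records with this outcome
                if support >= min_support and confidence >= min_confidence:
                    rules.append((antecedent, outcome, support, confidence))
    return rules

# Illustrative access log; attribute names and values are made up.
access_log = [
    {"role": "admin", "location": "office", "decision": "allow"},
    {"role": "admin", "location": "remote", "decision": "allow"},
    {"role": "guest", "location": "remote", "decision": "deny"},
    {"role": "guest", "location": "office", "decision": "deny"},
    {"role": "staff", "location": "office", "decision": "allow"},
]
for antecedent, outcome, support, confidence in mine_rules(access_log, "decision"):
    conditions = " AND ".join(f"{k} = {v}" for k, v in antecedent)
    print(f"IF {conditions} THEN decision = {outcome} "
          f"(support {support:.2f}, confidence {confidence:.2f})")
```

On this toy log, only the two role-based rules pass both thresholds; the rule for `staff` is dropped because a single supporting record falls below the minimum support.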

Incremental learning and dynamism

In our framework, this layer is concerned with finalizing the resultant security model by incorporating additional intelligence as needed. This is made possible by further processing in several modules. For instance, the post-processing and improvement module in this layer could simplify the extracted knowledge according to particular requirements by incorporating domain-specific knowledge. As attack classification or prediction models based on machine learning techniques strongly rely on the training data, they can hardly be generalized to other datasets, which could be significant for some applications. To address such limitations, this module is responsible for utilizing domain knowledge in the form of a taxonomy or ontology to improve attack correlation in cybersecurity applications.

Another significant module, recency mining and updating the security model, is responsible for keeping the security model up-to-date for better performance by extracting the latest data-driven security patterns. The extracted knowledge discussed in the earlier layer is based on a static initial dataset and reflects the overall patterns in that dataset. However, such knowledge might not guarantee higher performance in several cases, because incremental security data arrives with more recent patterns. In many cases, such incremental data may contain different patterns that conflict with existing knowledge. Thus, applying the concept of RecencyMiner [ 170 ] to incremental security data and extracting new patterns can be more effective than relying on the existing old patterns. The reason is that recent security patterns and rules are more likely to be significant than older ones for predicting cyber risks or attacks. Rather than processing the whole security dataset again, recency-based dynamic updating according to the new patterns would be more efficient in terms of processing and outcome. This could make the resultant cybersecurity model intelligent and dynamic. Finally, the response planning and decision making module is responsible for making decisions based on the extracted insights and taking the necessary actions to protect the system from cyber-attacks and to provide automated and intelligent services. The services might differ depending on the particular requirements of a given security problem.
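The idea of updating a model from recent data without reprocessing the whole dataset can be illustrated with a sliding-window sketch. This is not the RecencyMiner algorithm of [ 170 ]; it is a simplified stand-in, with a hypothetical class name, window size, and event labels, that shows how recent patterns can override older, conflicting ones through constant-time incremental updates.

```python
from collections import Counter, defaultdict, deque

class RecencyModel:
    """Keeps per-context outcome counts over only the most recent `window` events."""

    def __init__(self, window):
        self.window = window
        self.recent = deque()               # events currently inside the window
        self.counts = defaultdict(Counter)  # context -> outcome counts

    def update(self, context, outcome):
        # Incremental update: add the new event, then evict the oldest one
        # once the window is full, instead of recounting all past data.
        self.recent.append((context, outcome))
        self.counts[context][outcome] += 1
        if len(self.recent) > self.window:
            old_context, old_outcome = self.recent.popleft()
            self.counts[old_context][old_outcome] -= 1

    def predict(self, context):
        # Unary plus drops outcomes whose count has fallen to zero.
        live = +self.counts[context]
        return live.most_common(1)[0][0] if live else None

model = RecencyModel(window=3)
for _ in range(3):
    model.update("vpn_login", "benign")
earlier = model.predict("vpn_login")    # based on the older pattern
for _ in range(2):
    model.update("vpn_login", "malicious")
latest = model.predict("vpn_login")     # recent events now dominate the window
```

The window size controls the trade-off: a short window reacts quickly to new attack patterns but forgets stable behavior sooner, while a long window does the opposite.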

Overall, this framework is a generic description that can potentially be used to discover useful insights from security data, to build smart cybersecurity systems, and to address complex security challenges such as intrusion detection, access control management, anomaly and fraud detection, or denial-of-service attacks in the area of cybersecurity data science.

Although several research efforts have been directed towards cybersecurity solutions, as discussed in the “Background”, “Cybersecurity data science”, and “Machine learning tasks in cybersecurity” sections, this paper presents a comprehensive view of cybersecurity data science. For this, we have conducted a literature review to understand cybersecurity data, various defense strategies including intrusion detection techniques, and different types of machine learning techniques in cybersecurity tasks. Based on our discussion of existing work, several research issues related to security datasets, data quality problems, policy rule generation, learning methods, data protection, feature engineering, security alert generation, recency analysis, etc. are identified that require further research attention in the domain of cybersecurity data science.

The scope of cybersecurity data science is broad. It covers several data-driven tasks such as intrusion detection and prevention, access control management, security policy generation, anomaly detection, spam filtering, fraud detection and prevention, and the detection of and defense against various types of malware attacks. Such a task-based categorization could be helpful for security professionals, including researchers and practitioners who are interested in the domain-specific aspects of security systems [ 171 ]. The output of cybersecurity data science can be used in many application areas such as Internet of Things (IoT) security [ 173 ], network security [ 174 ], cloud security [ 175 ], mobile and web applications [ 26 ], and other relevant cyber areas. Moreover, intelligent cybersecurity solutions are important for the banking industry, the healthcare sector, or the public sector, where data breaches typically occur [ 36 , 176 ]. Besides, data-driven security solutions could also be effective in AI-based blockchain technology, where AI works with huge volumes of security event data to extract useful insights using machine learning techniques, and the blockchain serves as a trusted platform to store such data [ 177 ].

Although this paper discusses cybersecurity data science with a focus on moving from raw security data to data-driven decision making for intelligent security solutions, it is also related to big data analytics in terms of data processing and decision making. Big data deals with data sets that are too large or complex for conventional processing and that are characterized by high volume, velocity, and variety. Big data analytics mainly has two parts: data management, which involves data storage, and analytics [ 178 ]. Analytics typically describes the process of analyzing such datasets to discover patterns, unknown correlations, rules, and other useful insights [ 179 ]. Thus, several advanced data analysis techniques such as AI, data mining, and machine learning could play an important role in processing big data by converting big problems into small problems [ 180 ]. To do this, strategies such as parallelization, divide-and-conquer, incremental learning, sampling, granular computing, and feature or instance selection can be used to make better decisions, reduce costs, or enable more efficient processing. In such cases, the concept of cybersecurity data science, particularly machine learning-based modeling, could be helpful for process automation and decision making for intelligent security solutions. Moreover, researchers could consider modified algorithms or models for handling big data on parallel computing platforms like Hadoop, Storm, etc. [ 181 ].
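The divide-and-conquer and parallelization strategies mentioned above amount to splitting a large set of security events into small chunks, summarizing each chunk independently, and merging the partial results. The sketch below shows that structure on a single machine using Python's standard thread pool; the event fields and function names are assumptions for the example, and threads are used only to illustrate the pattern, since a platform such as Hadoop or Storm would distribute the chunks across machines.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def chunked(items, size):
    """Divide: yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

def count_chunk(events):
    """Conquer: count failed logins per source address within one chunk."""
    return Counter(e["src"] for e in events if e["status"] == "failed")

def failed_logins_by_source(events, chunk_size=2, workers=4):
    """Combine: merge the per-chunk counts into one overall count."""
    total = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(count_chunk, chunked(events, chunk_size)):
            total.update(partial)
    return total

# Illustrative events only; addresses and fields are made up.
events = [
    {"src": "10.0.0.5", "status": "failed"},
    {"src": "10.0.0.5", "status": "failed"},
    {"src": "10.0.0.9", "status": "ok"},
    {"src": "10.0.0.9", "status": "failed"},
    {"src": "10.0.0.5", "status": "failed"},
]
totals = failed_logins_by_source(events)
```

Because each chunk is summarized independently and the merge is a simple addition of counts, the same code shape also supports incremental learning: counts for newly arrived chunks can be added to an existing total without revisiting old data.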

Based on the concept of cybersecurity data science discussed in this paper, future work could build a data-driven security model for a particular security problem, carry out the relevant empirical evaluation to measure the effectiveness and efficiency of the model, and assess its usability in real-world application domains.

Motivated by the growing significance of cybersecurity, data science, and machine learning technologies, in this paper we have discussed how cybersecurity data science applies to data-driven intelligent decision making in smart cybersecurity systems and services. We have also discussed how it can affect security data, both in terms of extracting insight into security incidents and in terms of the dataset itself. We aimed to work on cybersecurity data science by discussing the state of the art concerning security incident data and the corresponding security services. We also discussed how machine learning techniques can have an impact in the domain of cybersecurity, and examined the security challenges that remain. In terms of existing research, much focus has been placed on traditional security solutions, with less work available on machine learning-based security systems. For each common technique, we have discussed relevant security research. The purpose of this article is to share an overview of the conceptualization, understanding, modeling, and thinking about cybersecurity data science.

We have further identified and discussed various key issues in security analysis to signpost future research directions in the domain of cybersecurity data science. Based on this knowledge, we have also provided a generic multi-layered framework for a cybersecurity data science model based on machine learning techniques, where the data is gathered from diverse sources and the analytics are complemented by the latest data-driven patterns to provide intelligent security services. The framework consists of several main phases: security data collection, data preparation, machine learning-based security modeling, and incremental learning and dynamism for smart cybersecurity systems and services. We specifically focused on extracting insights from security data, starting from a research design that pays particular attention to concepts for data-driven intelligent security solutions.

Overall, this paper aimed not only to discuss cybersecurity data science and relevant methods but also to discuss the applicability towards data-driven intelligent decision making in cybersecurity systems and services from machine learning perspectives. Our analysis and discussion can have several implications both for security researchers and practitioners. For researchers, we have highlighted several issues and directions for future research. Other areas for potential research include empirical evaluation of the suggested data-driven model, and comparative analysis with other security systems. For practitioners, the multi-layered machine learning-based model can be used as a reference in designing intelligent cybersecurity systems for organizations. We believe that our study on cybersecurity data science opens a promising path and can be used as a reference guide for both academia and industry for future research and applications in the area of cybersecurity.

Availability of data and materials

Not applicable.

Abbreviations

  • ML: Machine learning
  • AI: Artificial intelligence
  • ICT: Information and communication technology
  • IoT: Internet of Things
  • DDoS: Distributed denial of service
  • IDS: Intrusion detection system
  • IPS: Intrusion prevention system
  • HIDS: Host-based intrusion detection system
  • NIDS: Network intrusion detection system
  • SIDS: Signature-based intrusion detection system
  • AIDS: Anomaly-based intrusion detection system

Li S, Da Xu L, Zhao S. The internet of things: a survey. Inform Syst Front. 2015;17(2):243–59.

Sun N, Zhang J, Rimba P, Gao S, Zhang LY, Xiang Y. Data-driven cybersecurity incident prediction: a survey. IEEE Commun Surv Tutor. 2018;21(2):1744–72.

McIntosh T, Jang-Jaccard J, Watters P, Susnjak T. The inadequacy of entropy-based ransomware detection. In: International conference on neural information processing. New York: Springer; 2019. p. 181–189

Alazab M, Venkatraman S, Watters P, Alazab M, et al. Zero-day malware detection based on supervised learning algorithms of api call signatures, 2010.

Shaw A. Data breach: from notification to prevention using pci dss. Colum Soc Probs. 2009;43:517.

Gupta BB, Tewari A, Jain AK, Agrawal DP. Fighting against phishing attacks: state of the art and future challenges. Neural Comput Appl. 2017;28(12):3629–54.

Av-test institute, germany, https://www.av-test.org/en/statistics/malware/ . Accessed 20 Oct 2019.

Ibm security report, https://www.ibm.com/security/data-breach . Accessed on 20 Oct 2019.

Fischer EA. Cybersecurity issues and challenges: In brief. Congressional Research Service (2014)

Juniper research. https://www.juniperresearch.com/ . Accessed on 20 Oct 2019.

Papastergiou S, Mouratidis H, Kalogeraki E-M. Cyber security incident handling, warning and response system for the european critical information infrastructures (cybersane). In: International conference on engineering applications of neural networks. New York: Springer; 2019. p. 476–87.

Aftergood S. Cybersecurity: the cold war online. Nature. 2017;547(7661):30.

Hey AJ, Tansley S, Tolle KM, et al. The fourth paradigm: data-intensive scientific discovery. 2009;1:

Cukier K. Data, data everywhere: A special report on managing information, 2010.

Google trends. In: https://trends.google.com/trends/ , 2019.

Anwar S, Mohamad Zain J, Zolkipli MF, Inayat Z, Khan S, Anthony B, Chang V. From intrusion detection to an intrusion response system: fundamentals, requirements, and future directions. Algorithms. 2017;10(2):39.

Mohammadi S, Mirvaziri H, Ghazizadeh-Ahsaee M, Karimipour H. Cyber intrusion detection by combined feature selection algorithm. J Inform Sec Appl. 2019;44:80–8.

Tapiador JE, Orfila A, Ribagorda A, Ramos B. Key-recovery attacks on kids, a keyed anomaly detection system. IEEE Trans Depend Sec Comput. 2013;12(3):312–25.

Tavallaee M, Stakhanova N, Ghorbani AA. Toward credible evaluation of anomaly-based intrusion-detection methods. IEEE Trans Syst Man Cybern C. 2010;40(5):516–24.

Foroughi F, Luksch P. Data science methodology for cybersecurity projects. arXiv preprint arXiv:1803.04219 , 2018.

Saxe J, Sanders H. Malware data science: Attack detection and attribution, 2018.

Rainie L, Anderson J, Connolly J. Cyber attacks likely to increase. Digital Life in 2025, 2014.

Fischer EA. Creating a national framework for cybersecurity: an analysis of issues and options. Library of Congress, Congressional Research Service, 2005.

Craigen D, Diakun-Thibault N, Purse R. Defining cybersecurity. Technol Innov Manag Rev. 2014;4(10):13–21.

Council NR. et al. Toward a safer and more secure cyberspace, 2007.

Jang-Jaccard J, Nepal S. A survey of emerging threats in cybersecurity. J Comput Syst Sci. 2014;80(5):973–93.

Mukkamala S, Sung A, Abraham A. Cyber security challenges: Designing efficient intrusion detection systems and antivirus tools. Vemuri, V. Rao, Enhancing Computer Security with Smart Technology.(Auerbach, 2006), 125–163, 2005.

Bilge L, Dumitraş T. Before we knew it: an empirical study of zero-day attacks in the real world. In: Proceedings of the 2012 ACM conference on computer and communications security. ACM; 2012. p. 833–44.

Davi L, Dmitrienko A, Sadeghi A-R, Winandy M. Privilege escalation attacks on android. In: International conference on information security. New York: Springer; 2010. p. 346–60.

Jovičić B, Simić D. Common web application attack types and security using asp .net. ComSIS, 2006.

Warkentin M, Willison R. Behavioral and policy issues in information systems security: the insider threat. Eur J Inform Syst. 2009;18(2):101–5.

Kügler D. “man in the middle” attacks on bluetooth. In: International Conference on Financial Cryptography. New York: Springer; 2003, p. 149–61.

Virvilis N, Gritzalis D. The big four-what we did wrong in advanced persistent threat detection. In: 2013 International Conference on Availability, Reliability and Security. IEEE; 2013. p. 248–54.

Boyd SW, Keromytis AD. Sqlrand: Preventing sql injection attacks. In: International conference on applied cryptography and network security. New York: Springer; 2004. p. 292–302.

Sigler K. Crypto-jacking: how cyber-criminals are exploiting the crypto-currency boom. Comput Fraud Sec. 2018;2018(9):12–4.

2019 data breach investigations report, https://enterprise.verizon.com/resources/reports/dbir/ . Accessed 20 Oct 2019.

Khraisat A, Gondal I, Vamplew P, Kamruzzaman J. Survey of intrusion detection systems: techniques, datasets and challenges. Cybersecurity. 2019;2(1):20.

Johnson L. Computer incident response and forensics team management: conducting a successful incident response, 2013.

Brahmi I, Brahmi H, Yahia SB. A multi-agents intrusion detection system using ontology and clustering techniques. In: IFIP international conference on computer science and its applications. New York: Springer; 2015. p. 381–93.

Qu X, Yang L, Guo K, Ma L, Sun M, Ke M, Li M. A survey on the development of self-organizing maps for unsupervised intrusion detection. In: Mobile networks and applications. 2019;1–22.

Liao H-J, Lin C-HR, Lin Y-C, Tung K-Y. Intrusion detection system: a comprehensive review. J Netw Comput Appl. 2013;36(1):16–24.

Alazab A, Hobbs M, Abawajy J, Alazab M. Using feature selection for intrusion detection system. In: 2012 International symposium on communications and information technologies (ISCIT). IEEE; 2012. p. 296–301.

Viegas E, Santin AO, Franca A, Jasinski R, Pedroni VA, Oliveira LS. Towards an energy-efficient anomaly-based intrusion detection engine for embedded systems. IEEE Trans Comput. 2016;66(1):163–77.

Xin Y, Kong L, Liu Z, Chen Y, Li Y, Zhu H, Gao M, Hou H, Wang C. Machine learning and deep learning methods for cybersecurity. IEEE Access. 2018;6:35365–81.

Dutt I, Borah S, Maitra IK, Bhowmik K, Maity A, Das S. Real-time hybrid intrusion detection system using machine learning techniques. 2018, p. 885–94.

Ragsdale DJ, Carver C, Humphries JW, Pooch UW. Adaptation techniques for intrusion detection and intrusion response systems. In: Smc 2000 conference proceedings. 2000 IEEE international conference on systems, man and cybernetics.’cybernetics evolving to systems, humans, organizations, and their complex interactions’(cat. No. 0). IEEE; 2000. vol. 4, p. 2344–2349.

Cao L. Data science: challenges and directions. Commun ACM. 2017;60(8):59–68.

Rizk A, Elragal A. Data science: developing theoretical contributions in information systems via text analytics. J Big Data. 2020;7(1):1–26.

Lippmann RP, Fried DJ, Graf I, Haines JW, Kendall KR, McClung D, Weber D, Webster SE, Wyschogrod D, Cunningham RK, et al. Evaluating intrusion detection systems: The 1998 darpa off-line intrusion detection evaluation. In: Proceedings DARPA information survivability conference and exposition. DISCEX’00. IEEE; 2000. vol. 2, p. 12–26.

Kdd cup 99. http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html . Accessed 20 Oct 2019.

Tavallaee M, Bagheri E, Lu W, Ghorbani AA. A detailed analysis of the kdd cup 99 data set. In: 2009 IEEE symposium on computational intelligence for security and defense applications. IEEE; 2009. p. 1–6.

Caida ddos attack 2007 dataset. http://www.caida.org/data/passive/ddos-20070804-dataset.xml/ . Accessed 20 Oct 2019.

Caida anonymized internet traces 2008 dataset. https://www.caida.org/data/passive/passive-2008-dataset . Accessed 20 Oct 2019.

Isot botnet dataset. https://www.uvic.ca/engineering/ece/isot/datasets/index.php/ . Accessed 20 Oct 2019.

The honeynet project. http://www.honeynet.org/chapters/france/ . Accessed 20 Oct 2019.

Canadian institute of cybersecurity, university of new brunswick, iscx dataset, http://www.unb.ca/cic/datasets/index.html/ . Accessed 20 Oct 2019.

Shiravi A, Shiravi H, Tavallaee M, Ghorbani AA. Toward developing a systematic approach to generate benchmark datasets for intrusion detection. Comput Secur. 2012;31(3):357–74.

The ctu-13 dataset. https://stratosphereips.org/category/datasets-ctu13 . Accessed 20 Oct 2019.

Moustafa N, Slay J. Unsw-nb15: a comprehensive data set for network intrusion detection systems (unsw-nb15 network data set). In: 2015 Military Communications and Information Systems Conference (MilCIS). IEEE; 2015. p. 1–6.

Cse-cic-ids2018 [online]. available: https://www.unb.ca/cic/datasets/ids-2018.html/ . Accessed 20 Oct 2019.

Cic-ddos2019 [online]. available: https://www.unb.ca/cic/datasets/ddos-2019.html/ . Accessed 28 Mar 2019.

Jing X, Yan Z, Jiang X, Pedrycz W. Network traffic fusion and analysis against ddos flooding attacks with a novel reversible sketch. Inform Fusion. 2019;51:100–13.

Xie M, Hu J, Yu X, Chang E. Evaluating host-based anomaly detection systems: application of the frequency-based algorithms to adfa-ld. In: International conference on network and system security. New York: Springer; 2015. p. 542–49.

Lindauer B, Glasser J, Rosen M, Wallnau KC, ExactData L. Generating test data for insider threat detectors. JoWUA. 2014;5(2):80–94.

Glasser J, Lindauer B. Bridging the gap: A pragmatic approach to generating insider threat data. In: 2013 IEEE Security and Privacy Workshops. IEEE; 2013. p. 98–104.

Enronspam. https://labs-repos.iit.demokritos.gr/skel/i-config/downloads/enron-spam/ . Accessed 20 Oct 2019.

Spamassassin. http://www.spamassassin.org/publiccorpus/ . Accessed 20 Oct 2019.

Lingspam. https://labs-repos.iit.demokritos.gr/skel/i-config/downloads/lingspampublic.tar.gz/ . Accessed 20 Oct 2019.

Alexa top sites. https://aws.amazon.com/alexa-top-sites/ . Accessed 20 Oct 2019.

Bambenek consulting—master feeds. available online: http://osint.bambenekconsulting.com/feeds/ . Accessed 20 Oct 2019.

Dgarchive. https://dgarchive.caad.fkie.fraunhofer.de/site/ . Accessed 20 Oct 2019.

Zago M, Pérez MG, Pérez GM. Umudga: A dataset for profiling algorithmically generated domain names in botnet detection. Data in Brief. 2020;105400.

Zhou Y, Jiang X. Dissecting android malware: characterization and evolution. In: 2012 IEEE Symposium on security and privacy. IEEE; 2012. p. 95–109.

Virusshare. http://virusshare.com/ . Accessed 20 Oct 2019.

Virustotal. https://virustotal.com/ . Accessed 20 Oct 2019.

Comodo. https://www.comodo.com/home/internet-security/updates/vdp/database . Accessed 20 Oct 2019.

Contagio. http://contagiodump.blogspot.com/ . Accessed 20 Oct 2019.

Kumar R, Xiaosong Z, Khan RU, Kumar J, Ahad I. Effective and explainable detection of android malware based on machine learning algorithms. In: Proceedings of the 2018 international conference on computing and artificial intelligence. ACM; 2018. p. 35–40.

Microsoft malware classification (big 2015). arXiv:org/abs/1802.10135/ . Accessed 20 Oct 2019.

Koroniotis N, Moustafa N, Sitnikova E, Turnbull B. Towards the development of realistic botnet dataset in the internet of things for network forensic analytics: bot-iot dataset. Future Gen Comput Syst. 2019;100:779–96.

McIntosh TR, Jang-Jaccard J, Watters PA. Large scale behavioral analysis of ransomware attacks. In: International conference on neural information processing. New York: Springer; 2018. p. 217–29.

Han J, Pei J, Kamber M. Data mining: concepts and techniques, 2011.

Witten IH, Frank E. Data mining: Practical machine learning tools and techniques, 2005.

Dua S, Du X. Data mining and machine learning in cybersecurity, 2016.

Kotpalliwar MV, Wajgi R. Classification of attacks using support vector machine (svm) on kddcup’99 ids database. In: 2015 Fifth international conference on communication systems and network technologies. IEEE; 2015. p. 987–90.

Pervez MS, Farid DM. Feature selection and intrusion classification in nsl-kdd cup 99 dataset employing svms. In: The 8th international conference on software, knowledge, information management and applications (SKIMA 2014). IEEE; 2014. p. 1–6.

Yan M, Liu Z. A new method of transductive svm-based network intrusion detection. In: International conference on computer and computing technologies in agriculture. New York: Springer; 2010. p. 87–95.

Li Y, Xia J, Zhang S, Yan J, Ai X, Dai K. An efficient intrusion detection system based on support vector machines and gradually feature removal method. Expert Syst Appl. 2012;39(1):424–30.

Raman MG, Somu N, Jagarapu S, Manghnani T, Selvam T, Krithivasan K, Sriram VS. An efficient intrusion detection technique based on support vector machine and improved binary gravitational search algorithm. Artificial Intelligence Review. 2019, p. 1–32.

Kokila R, Selvi ST, Govindarajan K. Ddos detection and analysis in sdn-based environment using support vector machine classifier. In: 2014 Sixth international conference on advanced computing (ICoAC). IEEE; 2014. p. 205–10.

Xie M, Hu J, Slay J. Evaluating host-based anomaly detection systems: Application of the one-class svm algorithm to adfa-ld. In: 2014 11th international conference on fuzzy systems and knowledge discovery (FSKD). IEEE; 2014. p. 978–82.

Saxena H, Richariya V. Intrusion detection in kdd99 dataset using svm-pso and feature reduction with information gain. Int J Comput Appl. 2014;98:6.

Chandrasekhar A, Raghuveer K. Confederation of fcm clustering, ann and svm techniques to implement hybrid nids using corrected kdd cup 99 dataset. In: 2014 international conference on communication and signal processing. IEEE; 2014. p. 672–76.

Shapoorifard H, Shamsinejad P. Intrusion detection using a novel hybrid method incorporating an improved knn. Int J Comput Appl. 2017;173(1):5–9.

Vishwakarma S, Sharma V, Tiwari A. An intrusion detection system using knn-aco algorithm. Int J Comput Appl. 2017;171(10):18–23.

Meng W, Li W, Kwok L-F. Design of intelligent knn-based alarm filter using knowledge-based alert verification in intrusion detection. Secur Commun Netw. 2015;8(18):3883–95.

Dada E. A hybridized svm-knn-pdapso approach to intrusion detection system. In: Proc. Fac. Seminar Ser., 2017, p. 14–21.

Sharifi AM, Amirgholipour SK, Pourebrahimi A. Intrusion detection based on joint of k-means and knn. J Converg Inform Technol. 2015;10(5):42.

Lin W-C, Ke S-W, Tsai C-F. Cann: an intrusion detection system based on combining cluster centers and nearest neighbors. Knowl Based Syst. 2015;78:13–21.

Koc L, Mazzuchi TA, Sarkani S. A network intrusion detection system based on a hidden naïve bayes multiclass classifier. Exp Syst Appl. 2012;39(18):13492–500.

Moon D, Im H, Kim I, Park JH. Dtb-ids: an intrusion detection system based on decision tree using behavior analysis for preventing apt attacks. J Supercomput. 2017;73(7):2881–95.

Ingre B, Yadav A, Soni AK. Decision tree based intrusion detection system for nsl-kdd dataset. In: International conference on information and communication technology for intelligent systems. New York: Springer; 2017. p. 207–18.

Malik AJ, Khan FA. A hybrid technique using binary particle swarm optimization and decision tree pruning for network intrusion detection. Cluster Comput. 2018;21(1):667–80.

Relan NG, Patil DR. Implementation of network intrusion detection system using variant of decision tree algorithm. In: 2015 international conference on nascent technologies in the engineering field (ICNTE). IEEE; 2015. p. 1–5.

Rai K, Devi MS, Guleria A. Decision tree based algorithm for intrusion detection. Int J Adv Netw Appl. 2016;7(4):2828.

Sarker IH, Abushark YB, Alsolami F, Khan AI. Intrudtree: a machine learning based cyber security intrusion detection model. Symmetry. 2020;12(5):754.

Puthran S, Shah K. Intrusion detection using improved decision tree algorithm with binary and quad split. In: International symposium on security in computing and communication. New York: Springer; 2016. p. 427–438.

Balogun AO, Jimoh RG. Anomaly intrusion detection using an hybrid of decision tree and k-nearest neighbor, 2015.

Azad C, Jha VK. Genetic algorithm to solve the problem of small disjunct in the decision tree based intrusion detection system. Int J Comput Netw Inform Secur. 2015;7(8):56.

Jo S, Sung H, Ahn B. A comparative study on the performance of intrusion detection using decision tree and artificial neural network models. J Korea Soc Dig Indus Inform Manag. 2015;11(4):33–45.

Zhan J, Zulkernine M, Haque A. Random-forests-based network intrusion detection systems. IEEE Trans Syst Man Cybern C. 2008;38(5):649–59.

Tajbakhsh A, Rahmati M, Mirzaei A. Intrusion detection using fuzzy association rules. Appl Soft Comput. 2009;9(2):462–9.

Mitchell R, Chen R. Behavior rule specification-based intrusion detection for safety critical medical cyber physical systems. IEEE Trans Depend Secure Comput. 2014;12(1):16–30.

Alazab M, Venkataraman S, Watters P. Towards understanding malware behaviour by the extraction of api calls. In: 2010 second cybercrime and trustworthy computing Workshop. IEEE; 2010. p. 52–59.


Acknowledgements

The authors would like to thank all the reviewers for their rigorous review and comments in several revision rounds. The reviews are detailed and helpful to improve and finalize the manuscript. The authors are highly grateful to them.

Author information

Authors and affiliations.

Swinburne University of Technology, Melbourne, VIC, 3122, Australia

Iqbal H. Sarker

Chittagong University of Engineering and Technology, Chittagong, 4349, Bangladesh

La Trobe University, Melbourne, VIC, 3086, Australia

A. S. M. Kayes, Paul Watters & Alex Ng

University of Nevada, Reno, USA

Shahriar Badsha

Macquarie University, Sydney, NSW, 2109, Australia

Hamed Alqahtani


Contributions

This article not only provides a discussion of cybersecurity data science and relevant methods but also discusses their applicability to data-driven intelligent decision making in cybersecurity systems and services. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Iqbal H. Sarker .

Ethics declarations

Competing interests.

The authors declare that they have no competing interests.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .


About this article

Cite this article.

Sarker, I.H., Kayes, A.S.M., Badsha, S. et al. Cybersecurity data science: an overview from machine learning perspective. J Big Data 7 , 41 (2020). https://doi.org/10.1186/s40537-020-00318-5


Received : 26 October 2019

Accepted : 21 June 2020

Published : 01 July 2020

DOI : https://doi.org/10.1186/s40537-020-00318-5


  • Decision making
  • Cyber-attack
  • Security modeling
  • Intrusion detection
  • Cyber threat intelligence

database security research paper


The Impact of Artificial Intelligence on Data System Security: A Literature Review

Ricardo Raimundo

1 ISEC Lisboa, Instituto Superior de Educação e Ciências, 1750-142 Lisbon, Portugal

Albérico Rosário

2 Research Unit on Governance, Competitiveness and Public Policies (GOVCOPP), University of Aveiro, 3810-193 Aveiro, Portugal

Associated Data

Not applicable.

Diverse forms of artificial intelligence (AI) are at the forefront of triggering digital security innovations based on the threats that are arising in this post-COVID world. On the one hand, companies are experiencing difficulty in dealing with security challenges with regard to a variety of issues, ranging from system openness, decision making, and quality control to web domains, to mention a few. On the other hand, in the last decade, research has focused on security capabilities based on tools such as platform complacency, intelligent trees, modeling methods, and outage management systems in an effort to understand the interplay between AI and those issues. The dependence on the emergence of AI in running industries and shaping the education, transport, and health sectors is now well known in the literature. AI is increasingly employed in managing data security across economic sectors. Thus, a literature review of AI and system security within the current digital society is opportune. This paper aims at identifying research trends in the field through a systematic bibliometric literature review (LRSB) of research on AI and system security. The review entails 77 articles indexed in the Scopus ® database, presenting up-to-date knowledge on the topic. The LRSB results were synthesized across current research subthemes. Findings are presented. The originality of the paper relies on its LRSB method, together with an extant review of articles that have not been categorized so far. Implications for future research are suggested.

1. Introduction

The assumption that the human brain may be deemed quite comparable to computers in some ways offers the spontaneous basis for artificial intelligence (AI), which is supported by psychology through the idea of humans and animals operating like machines that process information by devices of associative memory [ 1 ]. Nowadays, researchers are working on the possibilities of AI to cope with varying issues of systems security across diverse sectors. Hence, AI is commonly considered an interdisciplinary research area that attracts considerable attention both in economics and social domains as it offers a myriad of technological breakthroughs with regard to systems security [ 2 ]. There is a universal trend of investing in AI technology to face security challenges of our daily lives, such as statistical data, medicine, and transportation [ 3 ].

Some claim that specific data from key sectors have supported the development of AI, namely the availability of data from e-commerce [ 4 ], businesses [ 5 ], and government [ 6 ], which provided substantial input to improve diverse machine-learning solutions and algorithms, in particular with respect to systems security [ 7 ]. Additionally, China and Russia have acknowledged the importance of AI for systems security and competitiveness in general [ 8 , 9 ]. Similarly, China has recognized the importance of AI in terms of housing security, aiming at becoming an authority in the field [ 10 ]. Those efforts are already being carried out in some leading countries in order to profit the most from its substantial benefits [ 9 ]. In spite of the huge development of AI in the last few years, the discussion around the topic of systems security is sparse [ 11 ]. Therefore, it is opportune to review the latest developments regarding the theme in order to map the advancements in the field and the ensuing outcomes [ 12 ]. In view of this, we intend to identify the principal trends of issues currently discussed on the topic in order to answer the main research question: What is the impact of AI on data system security?

The article is organized as follows. In Section 2 , we put forward diverse theoretical concepts related to AI in systems security. In Section 3 , we present the methodological approach. In Section 4 , we discuss the main fields of use of AI with regard to systems security, which came out from the literature. Finally, we conclude this paper by suggesting implications and future research avenues.

2. Literature Trends: AI and Systems Security

The concept of AI was introduced following the creation of the notion of the digital computing machine, in an attempt to ascertain whether a machine is able to “think” [ 1 ] or whether the machine can carry out humans’ tasks [ 13 ]. AI is a vast domain of information and computer technologies (ICT), which aims at designing systems that can operate autonomously, analogous to the individuals’ decision-making process [ 14 ]. In terms of AI, a machine may learn from experience by processing an immeasurable quantity of data while distinguishing patterns in it, as in the case of Siri [ 15 ] and image recognition [ 16 ], technologies based on machine learning, which is a subtheme of AI, defined as intelligent systems with the capacity to think and learn [ 1 ].

Furthermore, AI entails a myriad of related technologies, such as neural networks [ 17 ] and machine learning [ 18 ], just to mention a few, and we can identify some research areas of AI:

  • (I) Machine learning is a set of technologies that allow computers to carry out algorithms based on gathered data and distinct orders, providing the machine with the capability to learn without instructions from humans, adjusting its own algorithm to the situation while learning and recoding itself, such as Google and Siri when performing distinct tasks ordered by voice [ 19 ], as well as video surveillance that tracks unusual behavior [ 20 ];
  • (II) Deep learning constitutes the ensuing progress of machine learning, in which the machine carries out tasks directly from pictures, text, and sound, through a wide set of data architectures that entail numerous layers in order to learn and characterize data with several levels of abstraction, thus imitating how the natural brain processes information [ 21 ]. This is illustrated, for example, in forming a certificate database structure of university performance key indicators, in order to fix issues such as identity authentication [ 21 ];
  • (III) Neural networks are composed of a pattern recognition system that machine/deep learning operates to perform learning from observational data, figuring out its own solutions such as an auto-steering gear system with a fuzzy regulator, which enables to select optimal neural network models of the vessel paths, to obtain in this way control activity [ 22 ];
  • (IV) Natural language processing machines analyze language and speech as it is spoken, resorting to machine learning and natural language processing, such as developing a swarm intelligence and active system, while mounting friendly human-computer interface software for users, to be implemented in educational and e-learning organizations [ 23 ];
  • (V) Expert systems are composed of software arrangements that assist in achieving answers to distinct inquiries provided either by a customer or by another software set, in which expert knowledge is set aside in a particular area of the application that includes a reasoning component to access answers, in view of the environmental information and subsequent decision making [ 24 ].

Those subthemes of AI are applied to many sectors, such as health institutions, education, and management, through varying applications related to systems security. These abovementioned processes have been widely deployed to solve important security issues such as the following application trends ( Figure 1 ):

  • (a) Cyber security, in terms of computer crime, behavior research, access control, and surveillance, as for example in the case of computer vision, in which an algorithm analyzes images, and CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) techniques [ 6 , 7 , 12 , 19 , 25 , 26 , 27 , 28 , 29 , 30 , 31 , 32 , 33 , 34 , 35 , 36 , 37 , 38 ];
  • (b) Information management, namely in supporting decision making, business strategy, and expert systems, for example, by improving the quality of the relevant strategic decisions by analyzing big data, as well as in the management of the quality of complex objects [ 2 , 4 , 5 , 11 , 14 , 24 , 39 , 40 , 41 , 42 , 43 , 44 , 45 , 46 , 47 , 48 , 49 , 50 , 51 , 52 , 53 , 54 , 55 , 56 , 57 , 58 , 59 , 60 ];
  • (c) Societies and institutions, regarding computer networks, privacy, and digitalization, legal and clinical assistance, for example, in terms of legal support of cyber security, digital modernization, systems to support police investigations and the efficiency of technological processes in transport [ 8 , 9 , 10 , 15 , 17 , 18 , 20 , 21 , 23 , 28 , 61 , 62 , 63 , 64 , 65 , 66 , 67 , 68 , 69 , 70 , 71 , 72 , 73 ];
  • (d) Neural networks, for example, in terms of designing a model of human personality for use in robotic systems [ 1 , 13 , 16 , 22 , 74 , 75 ].


Subthemes/network of all keywords of AI. Source: own elaboration.

Through these streams of research, we will explain how the huge potential of AI can be deployed to enhance the systems security that is in use both in states and organizations, to mitigate risks and increase returns while identifying and averting cyber attacks and determining the best course of action [ 19 ]. AI could even prove more effective than humans in averting potential threats through various security solutions such as redundant systems of video surveillance, VOIP voice network technology security strategies [ 36 , 76 , 77 ], and dependence upon diverse platforms for protection (platform complacency) [ 30 ].

The design of the abovementioned conceptual and technological framework was not made randomly, as we did a preliminary search on Scopus with the keywords “Artificial Intelligence” and “Security”.

3. Materials and Methods

We carried out a systematic bibliometric literature review (LRSB) of the “Impact of AI on Data System Security”. The LRSB is a study concept based on a detailed, thorough identification and synthesis of information. It is an alternative to traditional literature reviews that improves: (i) the validity of the review, providing a set of steps that can be followed if the study is replicated; (ii) accuracy, providing and demonstrating arguments strictly related to research questions; and (iii) the generalization of the results, allowing the synthesis and analysis of accumulated knowledge [ 78 , 79 , 80 ]. Thus, the LRSB is a “guiding instrument” that steers the review according to the objectives.

The study follows the steps suggested by Raimundo and Rosário: (i) definition of the research question; (ii) location of the studies; (iii) selection and evaluation of studies; (iv) analysis and synthesis; (v) presentation of results; and finally (vi) discussion and conclusion of results. This methodology ensures a comprehensive, auditable, replicable review that answers the research questions.

The review was carried out in June 2021, with a bibliographic search in the Scopus database of scientific articles published until June 2021. The search was carried out in three phases: (i) using the keyword “Artificial Intelligence”, 382,586 documents were obtained; (ii) adding the keyword “Security”, we obtained a set of 15,916 documents, and limiting the subject area to Business, Management, and Accounting, 401 documents were obtained; and finally (iii) restricting to the exact keywords “Data security” and “Systems security”, a total of 77 documents were obtained ( Table 1 ).

Screening methodology.

Source: own elaboration.

The search strategy resulted in 77 academic documents. This set of eligible documents was assessed for academic and scientific relevance and quality. By document type, they break down as follows: conference papers (43); articles (29); reviews (3); letter (1); and retracted (1).

Peer-reviewed academic documents on the impact of artificial intelligence on data system security were selected until 2020. In the period under review, 2021 was the year with the highest number of peer-reviewed academic documents on the subject, with 18 publications, with 7 publications already confirmed for 2021. Figure 2 reviews peer-reviewed publications published until 2021.


Number of documents by year. Source: own elaboration.

The publications were distributed as follows: 2011 2nd International Conference on Artificial Intelligence Management Science and Electronic Commerce Aimsec 2011 Proceedings (14); Proceedings of the 2020 IEEE International Conference Quality Management Transport and Information Security Information Technologies IT and Qm and Is 2020 (6); Proceedings of the 2019 IEEE International Conference Quality Management Transport and Information Security Information Technologies IT and Qm and Is 2019 (5); Computer Law and Security Review (4); Journal of Network and Systems Management (4); Decision Support Systems (3); Proceedings 2021 21st Acis International Semi Virtual Winter Conference on Software Engineering Artificial Intelligence Networking and Parallel Distributed Computing Snpd Winter 2021 (3); IEEE Transactions on Engineering Management (2); Ictc 2019 10th International Conference on ICT Convergence ICT Convergence Leading the Autonomous Future (2); Information and Computer Security (2); Knowledge Based Systems (2); and, with 1 publication each (2013 3rd International Conference on Innovative Computing Technology Intech 2013; 2020 IEEE Technology and Engineering Management Conference Temscon 2020; 2020 International Conference on Technology and Entrepreneurship Virtual Icte V 2020; 2nd International Conference on Current Trends In Engineering and Technology Icctet 2014; ACM Transactions on Management Information Systems; AFE Facilities Engineering Journal; Electronic Design; Facct 2021 Proceedings of the 2021 ACM Conference on Fairness Accountability and Transparency; HAC; ICE B 2010 Proceedings of the International Conference on E Business; IEEE Engineering Management Review; Icaps 2008 Proceedings of the 18th International Conference on Automated Planning and Scheduling; Icaps 2009 Proceedings of the 19th International Conference on Automated Planning and Scheduling; Industrial Management and Data Systems; Information and Management; Information Management and Computer Security; Information Management Computer Security; Information Systems Research; International Journal of Networking and Virtual Organisations; International Journal of Production Economics; International Journal of Production Research; Journal of the Operational Research Society; Proceedings 2020 2nd International Conference on Machine Learning Big Data and Business Intelligence Mlbdbi 2020; Proceedings Annual Meeting of the Decision Sciences Institute; Proceedings of the 2014 Conference on IT In Business Industry and Government An International Conference By Csi on Big Data Csibig 2014; Proceedings of the European Conference on Innovation and Entrepreneurship Ecie; TQM Journal; Technology In Society; Towards the Digital World and Industry X 0 Proceedings of the 29th International Conference of the International Association for Management of Technology Iamot 2020; Wit Transactions on Information and Communication Technologies).

We can say that in recent years there has been some interest in research on the impact of artificial intelligence on data system security.

Table 2 reports the Scimago Journal & Country Rank (SJR), the best quartile, and the H index for each publication.

Scimago journal and country rank impact factor.

Note: * data not available. Source: own elaboration.

Information Systems Research is the most quoted publication with 3510 (SJR), Q1, and H index 159.

There is a total of 11 journals in Q1, 3 journals in Q2, 2 journals in Q3, and 2 journals in Q4. Journals in the best quartile Q1 represent 27% of the 41 journal titles; Q2 represents 7%; and Q3 and Q4 represent 5% each. Finally, for 23 of the publications, representing 56%, the data are not available.

As evident from Table 2 , most of the journals for which quartile data are available rank in the Q1 best quartile.
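The quartile shares quoted above can be re-derived from the journal counts given in the text. The following is a minimal sketch, not part of the original study; the names `quartile_counts` and `shares` are illustrative assumptions, and only the counts themselves come from the text.

```python
# Journal counts per best quartile, as reported in the text.
quartile_counts = {"Q1": 11, "Q2": 3, "Q3": 2, "Q4": 2, "not available": 23}

# Total number of journal titles covered by the counts.
total = sum(quartile_counts.values())


def shares(counts):
    """Return each category's share of the total as a whole-number percentage."""
    n = sum(counts.values())
    return {name: round(100 * value / n) for name, value in counts.items()}


quartile_shares = shares(quartile_counts)

for name, pct in quartile_shares.items():
    print(f"{name}: {quartile_counts[name]} of {total} ({pct}%)")

# Share of Q1 among only those journals that have a quartile assigned.
ranked = total - quartile_counts["not available"]
q1_among_ranked = round(100 * quartile_counts["Q1"] / ranked)
print(f"Q1 among ranked journals: {quartile_counts['Q1']} of {ranked} ({q1_among_ranked}%)")
```

Keeping the "not available" group as its own category makes it explicit that the Q1 share depends on the denominator: it is computed once over all 41 titles and once over the 18 titles that have a quartile.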

The subject areas covered by the 77 scientific documents were: Business, Management and Accounting (77); Computer Science (57); Decision Sciences (36); Engineering (21); Economics, Econometrics, and Finance (15); Social Sciences (13); Arts and Humanities (3); Psychology (3); Mathematics (2); and Energy (1).

The most cited article was “CANN: An intrusion detection system based on combining cluster centers and nearest neighbors” by Lin, Ke, and Tsai, with 290 citations, published in Knowledge-Based Systems, with 1590 (SJR), the best quartile (Q1), and an H index of 121. The article proposes a new resource representation approach, the cluster center and nearest neighbor approach.

In Figure 3 , we can analyze the evolution of citations of documents published between 2010 and 2021, which shows a growing number of citations, with an R² of 0.45%.


Evolution and number of citations between 2010 and 2021. Source: own elaboration.

The h index was used to assess the productivity and impact of the documents; it is the largest number h such that h of the documents have each been cited at least h times. Of the documents considered for the h index, 11 have been cited at least 11 times.
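The h index described above can be computed directly from a list of per-document citation counts. The following is a minimal sketch under stated assumptions: the function name `h_index` and the example counts are hypothetical and are not the review's data.

```python
def h_index(citations):
    """Return the largest h such that at least h documents have >= h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # the top `rank` documents all have at least `rank` citations
        else:
            break
    return h


# Hypothetical citation counts, for illustration only.
example_counts = [25, 8, 5, 3, 3, 1, 0]
print(h_index(example_counts))
```

Sorting in descending order means the loop can stop at the first document whose citation count falls below its rank, since every later document has at most as many citations.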

In Appendix A , Table A1 , citations of all scientific articles until 2021 are analyzed; 35 documents were not cited until 2021.

Appendix A , Table A2 , examines the self-citation of documents until 2021, in which a total of 16 self-citations were identified.

In Figure 4 , a bibliometric analysis was performed to analyze and identify indicators on the dynamics and evolution of scientific information using the main keywords. The analysis of the bibliometric research results using the scientific software VOSviewer aims to identify the main keywords of research in “Artificial Intelligence” and “Security”.


Network of linked keywords. Source: own elaboration.

The linked keywords can be analyzed in Figure 4 , making it possible to clarify the network of keywords that appear together/linked in each scientific article, allowing us to know the topics analyzed by the research and to identify future research trends.
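The idea behind a network of linked keywords can be illustrated by counting how often two keywords appear in the same article. The following is a minimal sketch of that counting step only; it does not reproduce the software used in the study, and the `articles` list and the function name `cooccurrence` are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical keyword lists, one per article (not the review's data).
articles = [
    ["artificial intelligence", "security", "decision making"],
    ["artificial intelligence", "security", "neural networks"],
    ["security", "electronic commerce"],
]


def cooccurrence(keyword_lists):
    """Count, for each unordered keyword pair, the number of articles containing both."""
    edges = Counter()
    for keywords in keyword_lists:
        # Deduplicate and sort so each pair is stored under a single canonical key.
        for a, b in combinations(sorted(set(keywords)), 2):
            edges[(a, b)] += 1
    return edges


edges = cooccurrence(articles)
for (a, b), weight in edges.most_common():
    print(f"{a} -- {b}: {weight}")
```

Each pair acts as an edge in the keyword network and its count as the edge weight, so heavily weighted edges point to topics that are frequently studied together.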

4. Discussion

By examining the selected pieces of literature, we have identified four principal areas that have been underscored and deserve further investigation with regard to cyber security in general: business decision making, electronic commerce business, AI social applications, and neural networks ( Figure 4 ). There is a myriad of areas where AI cyber security can be applied throughout the social, private, and public domains of our daily lives, from Internet banking to digital signatures.

First, the possible reduction of unnecessary leakage of accounting information has been discussed [ 27 ], mainly with regard to the security drawbacks of VOIP technology in IP network systems and the subsequent safety measures [ 77 ], which comprise a secure dynamic password used in Internet banking [ 29 ].

Second, some computer user cyber security behaviors have been researched, which include both a naïve lack of concern about the likelihood of facing security threats and dependence upon specific platforms for protection, as well as dependence on guidance from trusted social others [ 30 ]. This has been partly resolved through mobile agent (MA) management systems in distributed networks, operating a model of an open management framework that provides a broad range of processes to enforce security policies [ 31 ].

Third, AI cyber systems security aims at achieving stability of the programming and analysis procedures by clarifying in detail the relationship between fault-tolerant programming and code security in order to strengthen it [ 33 ], and by offering an overview of existing cyber security tasks and a roadmap [ 32 ].

Fourth, in this vein, numerous AI tools have been developed to achieve a multi-stage security task approach for a full security life cycle [ 38 ]. New digital signature technology has been built on elliptic curve cryptography, which is increasingly relied upon [ 28 ]; a new experimental CAPTCHA has been developed, with more interference characters and a colorful background [ 8 ], to provide better protection against spambots, allowing people with little knowledge of sign languages to recognize gestures on video relatively fast [ 70 ]; novel detection approaches beyond traditional firewall systems have been developed (e.g., cluster center and nearest neighbor, CANN), with higher efficiency in detecting attacks [ 71 ]; AI security solutions for IoT (e.g., blockchain) have been proposed in response to the security flaws of its centralized architecture [ 34 ]; and an integrated AI algorithm has been developed to identify malicious web domains and protect Internet users [ 19 ].

In sum, AI has progressed lately through advances in machine learning, offering multilevel solutions to the security problems faced in both operating systems and networks, and encompassing algorithms, methods, and tools widely used by security experts to improve these systems [ 6 ]. In this way, we present a detailed overview of the impacts of AI on each of those fields.

4.1. Business Decision Making

AI has an increasing impact on systems security aimed at supporting decision making at the management level. Increasingly, the discussion concerns expert systems that, along with the evolution of computers, can be integrated into corporate culture [ 24 ]. Such systems are expected to maximize benefits against costs in situations where a decision-making agent has to choose among a limited set of strategies with sparse information [ 14 ], while a high-quality strategic decision is required in a relatively short period of time, for example through intelligent analysis of big data [ 39 ].

Secondly, distributed decision models coordinated toward an overall solution have been adopted, relying on a decision support platform [ 40 ] that takes the form of either mathematical/modeling support for a situational approach to complex objects [ 41 ] or a web-based multi-perspective decision support system (DSS) [ 42 ].

Thirdly, the problem of software for the support of management decisions was resolved by combining a systematic approach with heuristic methods and game-theoretic modeling [ 43 ] that, in the case of industrial security, reduces the subsequent number of incidents [ 44 ].

Fourthly, in terms of industrial management and ISO information security control, a semantic decision support system increases the automation level and supports the decision-maker in identifying the most appropriate strategy against a modeled environment [ 45 ] while providing understandable technology that is based on the decisions and interacts with the machine [ 46 ].

Finally, with respect to teamwork, AI validates a theoretical model of behavioral decision theory to assist organizational leaders in deciding on strategic initiatives [ 11 ] while helping to identify who may have information that is valuable for solving a collaborative scheduling problem [ 47 ].

4.2. Electronic Commerce Business

This research stream focuses on business, principally on security measures for electronic commerce (e-commerce) that improve systems security, in order to avoid cyber attacks, innovate, obtain information, and ultimately attract clients [ 5 ].

First, intelligent models have been built around the factors that induce Internet users to make an online purchase, in order to build effective strategies [ 48 ]. Cyber security issues have also been discussed through diverse AI models for controlling unauthorized intrusion [ 49 ], in particular in countries such as China, to solve drawbacks in firewall technology, data encryption [ 4 ], and qualification [ 2 ].

Second, businesses need to adapt to the increasingly demanding environment of a world pandemic by finding new revenue sources [ 3 ] and restructuring digital business processes to promote new products and services, with sufficient privacy and with manpower that is qualified accordingly and able to deal with AI [ 50 ].

Third, the aim is to develop AI able to intelligently protect business, either through a distinct model of decision trees within the Internet of Things (IoT) [ 51 ] or by improving network management through active networks technology with a multi-agent architecture able to imitate the reactive behavior and logical inference of a human expert [ 52 ].

Fourth, the aim is to reconceptualize the role of AI within the spatial and non-spatial dimensions of proximity in a new digital industry framework, connecting the physical and digital production spaces in both traditional and new technology-based approaches (e.g., industry 4.0), thus promoting innovation partnerships and efficient technology and knowledge transfer [ 53 ]. In this vein, there is an attempt to move management systems from a centralized to a distributed paradigm along the network, based on criteria such as the delegation degree [ 54 ], which also allows the transition from industry 4.0 to industry 5.0 through AI in the form of the Internet of everything, multi-agent systems, emergent intelligence, and enterprise architecture [ 58 ].

Fifth, in terms of manufacturing environments, following that networking paradigm, there is also an attempt to manage agent communities in distributed and varied manufacturing environments through an AI multi-agent virtual manufacturing system (e.g., MetaMorph) that optimizes real-time planning and security [ 55 ]. In addition, smart factories have been built to mitigate the security vulnerabilities of intelligent manufacturing process automation through AI security measures and devices [ 56 ], as, for example, in the design of a mine security monitoring configuration software platform with a real-time framework (e.g., the device management class diagram) [ 26 ]. Smart buildings have been adopted in manufacturing and nonmanufacturing environments, aiming at reducing costs and the height of the building and at minimizing the space required for users [ 57 ].

Finally, aiming at augmenting the cyber security of e-commerce and business in general, other projects have been put in place, such as computer-assisted audit tools (CAATs), which are able to carry out continuous auditing and allow auditors to increase their productivity amid real-time accounting and electronic data interchange [ 59 ], together with a surge in the demand for high-tech/AI jobs [ 60 ].

4.3. AI Social Applications

As seen, AI systems security can be widely deployed across almost all domains of society, be it in regulation, Internet security, computer networks, digitalization, health, or numerous other fields (see Figure 4).

First, there has been an attempt to regulate cyber security, namely in terms of legal support for cyber security with regard to the application of artificial intelligence technology [ 61 ], in an innovative and economically and politically friendly way [ 9 ]. This extends to fields such as infrastructure, by improving the efficiency of technological processes in transport, for example by reducing inter-train stops [ 63 ], and education, by improving the cyber security of university E-Gov, for example by forming a certificate database structure of university key performance indicators [ 21 ], and of e-learning organizations through swarm intelligence [ 23 ], and by identifying the risks a digital campus will face according to ISO series standards and risk-level criteria [ 25 ], while suggesting relevant solutions to key issues in its network information safety [ 12 ].

Second, some moral and legal issues have arisen, in particular in relation to privacy, sex, and childhood. This is the case of the ethical/legal legitimacy of publishing open-source dual-purpose machine-learning algorithms [ 18 ], the need for a legislated framework comprising regulatory agencies and representatives of all stakeholder groups gathered around AI [ 68 ], the gendering of VPAs as female (e.g., Siri), which replicates normative assumptions about the potential role of women as secondary to men [ 15 ], the need to include communities in upholding their own code [ 35 ], and the need to improve the legal position of people, and children in particular, who are exposed to AI-mediated risk profiling practices [ 7 , 69 ].

Third, the traditional industry also benefits from AI, which can improve, for example, the safety of coal mines by analyzing the coal mine safety scheme storage structure and building a data warehouse and analysis [ 64 ]. It can also improve the security of smart cities and the ensuing intelligent devices and networks through AI frameworks (e.g., Unified Theory of Acceptance and Use of Technology, UTAUT) [ 65 ]; of housing [ 10 ] and building [ 66 ] security systems in terms of energy balance (e.g., Direct Digital Control System), relying on fuzzy logic as a non-precise programming tool that allows the systems to function well [ 66 ]; and of outage management systems (OMSs) exposed to data integrity attacks, with AI means to detect and mitigate them [ 67 ].

Fourth, citizens in general have reaped benefits from areas of AI such as police investigation, through expert systems that support the profiling and tracking of criminals based on machine-learning and neural network techniques [ 17 ]; video surveillance systems with real-time accuracy [ 76 ], which resort to models that detect moving objects while keeping up with environment changes [ 36 ] and to dynamic sensor selection when processing the image streams of all cameras simultaneously [ 37 ]; and ambient intelligence (AmI) spaces, in which devices, sensors, and wireless networks combine data from diverse sources and monitor user preferences, with the subsequent effects on users’ privacy addressed under a regulatory privacy framework [ 62 ].

Finally, AI has brought society noteworthy progress in clinical assistance, through an electronic health record system integrated into existing risk management software to monitor sepsis in the intensive care unit (ICU) via a peer-to-peer VPN connection and with a fast and intuitive user interface [ 72 ]. It has also offered an organizational solution in the form of an innovative housing model that combines remote surveillance, diagnostics, and the use of sensors and video to detect anomalies in the behavior and health of the elderly [ 20 ], together with a case-based decision support system for the automatic real-time surveillance and diagnosis of health care-associated infections using diverse machine-learning techniques [ 73 ].

4.4. Neural Networks

Neural networks, or the process through which machines learn from observational data and come up with their own solutions, have lately been discussed across several streams of issues.

First, it has been argued that it is opportune to develop a software library for creating artificial neural networks for machine learning to solve non-standard tasks [ 74 ], along with a decentralized and integrated AI environment that can accommodate video data storage and event-driven video processing, gathered from varying sources such as video surveillance systems [ 16 ], whose images could be improved through AI [ 75 ].

Second, neural network architecture has progressed to a huge number of neurons in the network, with associative memory devices designed within supercomputers to hold a number of neurons comparable to the human brain [ 1 ]. Subsequently, such neural networks can be modeled on the basis of a switch architecture that interconnects neurons and stores the training results in memory, and on the basis of genetic algorithms, so that they can be exported to other robotic systems: a model of human personality for use in robotic systems in medicine and biology [ 13 ].

Finally, the neural network is quite representative of AI in its attempt to operate without human guidance once trained through human learning and self-learning, as in the case of a current vessel seaway positioning system, which involves a fuzzy logic regulator and a neural network classifier that selects optimal neural network models of the vessel paths to obtain control activity [ 22 ].

4.5. Data Security and Access Control Mechanisms

Access control can be deemed a classic security model that is pivotal to any security and privacy protection process, supporting data access from different environments and protecting against unauthorized access according to a given security policy [ 81 ]. In this vein, data security and access control mechanisms have been widely debated, particularly with regard to their distinct contextual conditions, for example the spatial and temporal environments that differ across diverse, decentralized networks. Those networks constitute a major challenge because they are dynamically located in “cloud” or “fog” environments rather than on fixed desktop structures, thus demanding innovative approaches to access security, such as fog-based context-aware access control (FB-CAAC) [ 81 ]. Context-awareness is, therefore, an important characteristic of changing environments, where users access resources anywhere and anytime. As a result, it is paramount to highlight the interplay between the information, now based on fuzzy sets, and its situational context in order to implement context-sensitive access control policies through diverse criteria such as subject- and action-specific attributes. In this way, different contextual conditions, such as user profile information and social relationship information, need to be added to the traditional spatial and temporal approaches to sustain these dynamic environments [ 81 ]. In the end, the corresponding policies should aim at defining the security and privacy requirements through a fog-based context-aware access control model that should be respected for distributed cloud and fog networks.
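To make the idea of a context-sensitive policy more concrete, the following is a minimal sketch of a deny-by-default check that combines subject and action attributes with spatial, temporal, and relationship conditions. It is not the FB-CAAC model of [ 81 ]: the class names `Context` and `Policy`, the function `is_allowed`, and the sample policy are all hypothetical, and the conditions are crisp rather than based on fuzzy sets.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class Context:
    """Contextual conditions attached to one access request (assumed fields)."""
    role: str          # user profile information
    location: str      # spatial condition
    at: time           # temporal condition
    relationship: str  # social relationship to the data subject


@dataclass(frozen=True)
class Policy:
    """One allow rule: who may do what, where, when, and in which relationship."""
    resource: str
    action: str
    role: str
    locations: frozenset
    start: time
    end: time
    relationships: frozenset

    def permits(self, resource: str, action: str, ctx: Context) -> bool:
        # Every condition must hold; a single mismatch rejects the request.
        return (
            resource == self.resource
            and action == self.action
            and ctx.role == self.role
            and ctx.location in self.locations
            and self.start <= ctx.at <= self.end
            and ctx.relationship in self.relationships
        )


def is_allowed(policies, resource: str, action: str, ctx: Context) -> bool:
    """Deny by default: access is granted only if some policy permits it."""
    return any(p.permits(resource, action, ctx) for p in policies)


# Hypothetical policy: nurses may read patient records on the ward,
# during the day shift, for patients they are treating.
POLICIES = [
    Policy(
        resource="patient_record",
        action="read",
        role="nurse",
        locations=frozenset({"ward"}),
        start=time(8, 0),
        end=time(18, 0),
        relationships=frozenset({"treating"}),
    )
]

request = Context(role="nurse", location="ward", at=time(10, 30), relationship="treating")
print(is_allowed(POLICIES, "patient_record", "read", request))
```

The same request issued from another location or outside the permitted hours would be refused, which is the behavior a context-aware model adds to a purely role-based check.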

5. Conclusion and Future Research Directions

This review illustrated the impacts of AI on systems security, which influence our daily digital life, business decision making, e-commerce, diverse social and legal issues, and neural networks.

First, AI will potentially impact our digital and Internet lives in the future, as the major trend is the emergence of increasingly new malicious threats from the Internet environment; greater attention should therefore be paid to cyber security. Likewise, the growing complexity of the business environment will demand more and more AI-based decision support systems that enable management to adapt quickly and accurately, while requiring unique digital e-manpower.

Second, e-commerce and manufacturing activity, principally amid the COVID-19 pandemic, tends to grow exponentially, as already observed, which demands subsequent progress in cyber security measures and strategies. The same applies to the social applications of AI, which, following the increase in distance services, will also tend to adopt this model, applied to improved e-health, e-learning, and e-elderly monitoring systems.

Third, the resulting divisive issues are being brought to the academic arena, which demands progress toward a legal framework able to encompass all the abovementioned issues in order to assist political decisions and match the expectations of citizens.

Lastly, further progress in neural network platforms is inevitable, as they represent the cutting edge of AI in terms of technology that imitates human thinking, the main goal of AI applications.

To summarize, we have presented useful insights into the impact of AI on systems security, and we have illustrated its influence both on the delivery of services to people, in particular in the security domains of their daily matters and in health and education, and on the business sector, through systems capable of supporting decision making. In addition, we have highlighted the state of the art in AI innovations applied to varying fields.

Future Research Issues

Given the aforementioned scenario, we also suggest further research avenues to reinforce existing theories and develop new ones, in particular the deployment of AI technologies in small and medium-sized enterprises (SMEs) with sparse resources and from the traditional sectors that constitute the core of intermediate economies and of less developed and peripheral regions. In addition, the building of CAAC solutions constitutes a promising field for controlling data resources in the cloud and throughout changing contextual conditions.

Acknowledgments

We would like to express our gratitude to the Editor and the Referees, who offered extremely valuable suggestions for improvement. The authors were supported by the GOVCOPP Research Unit of Universidade de Aveiro and ISEC Lisboa, Higher Institute of Education and Sciences.

Overview of document citations period ≤ 2010 to 2021.

Overview of document self-citation period ≤ 2010 to 2020.

Author Contributions

Conceptualization, R.R. and A.R.; data curation, R.R. and A.R.; formal analysis, R.R. and A.R.; funding acquisition, R.R. and A.R.; investigation, R.R. and A.R.; methodology, R.R. and A.R.; project administration, R.R. and A.R.; software, R.R. and A.R.; validation, R.R. and A.R.; resources, R.R. and A.R.; writing (original draft preparation), R.R. and A.R.; writing (review and editing), R.R. and A.R.; visualization, R.R. and A.R.; supervision, R.R. and A.R. All authors have read and agreed to the published version of the manuscript.

This research received no external funding.

Institutional Review Board Statement

Informed consent statement, data availability statement, conflicts of interest.

The authors declare no conflict of interest. The funders had no role in the design of the study, in the collection, analyses, or interpretation of data, in the writing of the manuscript, or in the decision to publish the results.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Published on 11.4.2024 in Vol 26 (2024)

This is a member publication of Imperial College London (Jisc)

Regulatory Standards and Guidance for the Use of Health Apps for Self-Management in Sub-Saharan Africa: Scoping Review

Authors of this article:

  • Benard Ayaka Bene 1,2, MBBS, MPH
  • Sunny Ibeneme 3, MD, PhD
  • Kayode Philip Fadahunsi 1, MBBS, MPH
  • Bala Isa Harri 4, MBBS, MPH, MSc
  • Nkiruka Ukor 5, MSc
  • Nikolaos Mastellos 1, BSc, PhD
  • Azeem Majeed 1, MD
  • Josip Car 1,6, MSc, MD, PhD

1 Department of Primary Care and Public Health, School of Public Health, Imperial College London, London, United Kingdom

2 Department of Public Health, Federal Ministry of Health, Abuja, Nigeria

3 Digital Health Specialist, UNICEF East Asia Pacific Regional Office, Bangkok, Thailand

4 Department of Health Planning, Research and Statistics, Federal Ministry of Health, Abuja, Nigeria

5 Strategic Health Information Cluster, World Health Organization, Abuja, Nigeria

6 School of Life Course & Population Sciences, King’s College London, London, United Kingdom

Corresponding Author:

Benard Ayaka Bene, MBBS, MPH

Department of Primary Care and Public Health

School of Public Health

Imperial College London

The Reynolds Building

St Dunstan’s Road

London, W6 8RP

United Kingdom

Phone: 44 7598439185

Email: [email protected]

Background: Health apps are increasingly recognized as crucial tools for enhancing health care delivery. Many countries, particularly those in sub-Saharan Africa, can substantially benefit from using health apps to support self-management and thus help to achieve universal health coverage and the third sustainable development goal. However, most health apps published in app stores are of unknown or poor quality, which poses a risk to patient safety. Regulatory standards and guidance can help address this risk and promote patient safety.

Objective: This review aims to assess the regulatory standards and guidance for health apps supporting evidence-based best practices in sub-Saharan Africa with a focus on self-management.

Methods: A methodological framework for scoping reviews was applied. A search strategy was built and applied across the following databases, gray literature sources, and institutional websites: PubMed, Scopus, World Health Organization (WHO) African Index Medicus, OpenGrey, WHO Regional Office for Africa Library, ICTworks, WHO Directory of eHealth policies, HIS Strengthening Resource Center, International Telecommunication Union, Ministry of Health websites, and Google. The search covered the period between January 2005 and January 2024. The findings were analyzed using a deductive descriptive content analysis. The policy analysis framework was adapted and used to organize the findings. The Reporting Items for Stakeholder Analysis tool guided the identification and mapping of key stakeholders based on their roles in regulating health apps for self-management.

Results: The study included 49 documents from 31 sub-Saharan African countries. While all the documents were relevant for stakeholder identification and mapping, only 3 regulatory standards and guidance contained relevant information on regulation of health apps. These standards and guidance primarily aimed to build mutual trust; promote integration, inclusion, and equitable access to services; and address implementation issues and poor coordination. They provided guidance on systems quality, software acquisition and maintenance, security measures, data exchange, interoperability and integration, involvement of relevant stakeholders, and equitable access to services. To enhance implementation, the standards highlight that legal authority, coordination of activities, building capacity, and monitoring and evaluation are required. A number of stakeholders, including governments, regulatory bodies, funders, intergovernmental and nongovernmental organizations, academia, and the health care community, were identified to play key roles in regulating health apps.

Conclusions: Health apps have huge potential to support self-management in sub-Saharan Africa, but the lack of regulatory standards and guidance constitutes a major barrier. Hence, for these apps to be safely and effectively integrated into health care, more attention should be given to regulation. Learning from countries with effective regulations can help sub-Saharan Africa build a more robust and responsive regulatory system, ensuring the safe and beneficial use of health apps across the region.

International Registered Report Identifier (IRRID): RR2-10.1136/bmjopen-2018-025714

Introduction

Health apps are the most widely used digital health products globally [ 1 , 2 ]. Harnessing the potential of health apps creates a huge opportunity in providing support for health care delivery, including patient communication, patient education, and decision support for self-management [ 3 - 8 ]. Health apps can be an effective tool to strengthen health systems worldwide, especially in low- and middle-income countries including those in sub-Saharan Africa [ 4 , 5 , 9 ]. As a result, the attainment of universal health coverage (UHC) and sustainable development goal (SDG) 3, good health and well-being, can be accelerated [ 8 , 10 ].

Many health apps fall below the expected quality threshold [ 11 ]. Several studies have found that widely used health apps are often technically unreliable and clinically unsafe [ 12 - 14 ] and do not comply with ethical standards and the principles of confidentiality of information and data privacy [ 15 , 16 ]. In addition, many commercially available health apps were not developed using interoperability standards that are widely accepted in sub-Saharan Africa (eg, Fast Healthcare Interoperability Resources [FHIR]) [ 17 - 20 ]. Consequently, it becomes difficult to integrate these apps into a clinical workflow.

Hence, regulation through robust mechanisms is crucial to enhance the development, implementation, and adoption of health apps. Regulatory standards and guidance are essential for the safety of patients as they ensure quality assurance of any new technology in health care and contribute to building mutual trust while promoting the optimal use of the technology [ 21 - 23 ]. Therefore, to ensure that health apps that are used to support the self-management of patients are technically reliable and clinically safe, interoperable across systems, and compliant with the principles of confidentiality of information and data privacy, there is a need for effective regulatory standards. Furthermore, effective regulation can help ensure that health apps for self-management are culturally functional and competent and are accessible to those who need them regardless of gender, ethnicity, geographical location, or financial status [ 24 - 31 ].

Since 2005, there have been ongoing efforts to strengthen digital health governance at both the national and international levels [ 32 , 33 ]. In 2018, the World Health Organization (WHO) member states renewed their commitment to using digital health technologies (DHTs) to advance UHC and SDG 3 [ 33 ]. However, to date, the extent to which the use of health apps for self-management is regulated across countries within the WHO African Region (also known as sub-Saharan Africa) remains unclear. Therefore, this review was conducted to identify available regulatory standards and guidance and assess the extent to which they regulate health apps for self-management in sub-Saharan Africa. The review also mapped out the key stakeholders and their roles in regulating health apps for self-management across sub-Saharan Africa.

Review Questions

The review attempted to answer the following questions: (1) What regulatory standards and guidance are available for regulating health apps for self-management across sub-Saharan Africa? (2) To what extent do regulatory standards and guidance regulate health apps for self-management in terms of what aspects are regulated; why, how, and for whom; and what aspects are not regulated? (3) Who are the key stakeholders and what are their roles in regulating health apps for self-management?

Study Design

The process of this scoping review followed the methodological framework for conducting a scoping study originally described by Arksey and O’Malley [ 34 ] and the updated methodological guidance for conducting a Joanna Briggs Institute scoping review [ 34 - 37 ]. The reporting of the review was guided by the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews) checklist [ 38 ]. A completed PRISMA-ScR checklist is provided in Multimedia Appendix 1 . The protocol of this scoping review was published in BMJ Open [ 30 ].

Identifying Relevant Documents

Two reviewers (BAB and SI) developed the search strategy with the assistance of a librarian and in consultation with other research team members (KPF, BIH, NU, NM, AM, and JC). The following key terms were included: policy, legislation, strategy, regulation, standard, criterion, framework, guidance, guideline, digital health, eHealth, app, WHO African Region, and sub-Saharan Africa, and the names of all sub-Saharan African countries.

Owing to the absence of regulatory standards and guidance in scientific databases, the search focus was narrowed down to gray literature sources and institutional websites, including OpenGrey, WHO Regional Office for Africa (AFRO) Library, repositories for digital health policies (ICTworks, WHO’s Directory of eHealth Policies, and Health Information System Strengthening Resource Center), as well as the websites of WHO, International Telecommunication Union (ITU), and Ministries of Health (MOHs). The only scientific databases searched were PubMed, Scopus, and WHO AIM. PubMed was not included in the protocol. We also conducted a systematic search on Google. We used truncation to increase the yield of the results. The search strategy was then applied across PubMed, Scopus, and WHO AIM databases using Boolean operators (mainly OR and AND) to combine search results. Gray literature sources and institutional websites were searched using phrases containing ≥2 keywords such as “eHealth regulation,” “digital health regulatory standard,” “eHealth regulatory standard,” “digital health regulation,” “digital health policy,” “eHealth policy,” “digital health strategy,” and “eHealth strategy.” For the Google search, we added the name of the country to the phrases (eg, “digital health regulation Nigeria”). The reference lists of the included documents were also searched, and key individuals at the MOHs, WHO Country Offices, and the WHO AFRO were contacted for related documents. When our search was conducted, the WHO Directory of eHealth policies website was unavailable, and the WHO AFRO Library was undergoing reconstruction. The search strategies for PubMed, Scopus, and WHO AIM are provided in Multimedia Appendix 2 . The search covered the period between January 2005 and January 2024.
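The way truncated terms and Boolean operators combine can be sketched as follows. This is an illustration only and does not reproduce the actual search strategies, which are given in Multimedia Appendix 2: the helper names `or_group`, `build_query`, and `google_phrases`, the truncated forms, and the choice of terms in each group are assumptions, and truncation and phrase syntax differ between databases.

```python
def or_group(terms):
    """Join synonyms with OR, quoting multi-word phrases."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(*groups):
    """Combine concept groups with AND so a record must match every concept."""
    return " AND ".join(or_group(g) for g in groups)


def google_phrases(phrases, countries):
    """Append each country name to each phrase, as done for the Google search."""
    return [f"{phrase} {country}" for phrase in phrases for country in countries]


# Hypothetical concept groups; "*" marks an assumed truncation symbol.
regulation = ["polic*", "legislation", "regulat*", "standard*", "guid*"]
technology = ["digital health", "eHealth", "app*"]
region = ["sub-Saharan Africa", "WHO African Region", "Nigeria"]

query = build_query(regulation, technology, region)
print(query)

print(google_phrases(["digital health regulation", "eHealth policy"], ["Nigeria", "Kenya"]))
```

Terms inside a group widen the search, whereas the AND between groups narrows it to records that mention regulation, the technology, and the region together.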

Study Selection

The search results obtained from PubMed, Scopus, and WHO AIM were imported into Mendeley (Elsevier) [ 39 ] to remove duplicates. The search conducted on OpenGrey did not yield any results, whereas relevant records obtained from institutional websites, repositories, and Google were downloaded as PDF copies and uploaded to Mendeley. After removing duplicates, the remaining results were imported into Covidence (Veritas Health Innovation) [ 40 ] for screening. Two reviewers (BAB and SI) applied the predefined eligibility criteria ( Textbox 1 ) to screen the documents in 2 stages (title and abstract or executive summary). All discrepancies were discussed until the reviewers reached agreement.

Inclusion criteria

  • Type of document: Regulatory standards, guidance, policies, strategies, and committee or government reports that address regulatory issues related to the use of health apps for self-management
  • Location: Documents developed and implemented in countries within sub-Saharan Africa
  • Date of publication: Documents developed since 2005; the global efforts toward promoting standards to minimize variability and potential harms that could arise from poorly regulated use of digital health began in 2005 [ 33 ]
  • Language: Documents written in English language and other official languages of sub-Saharan African countries (Portuguese and French)

Exclusion criteria

  • Type of document: Standards, guidance, policies, strategies, and reports not related to regulation of health apps
  • Location: Documents from countries outside sub-Saharan Africa
  • Date of publication: Documents developed before 2005
  • Language: None
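The eligibility criteria above can be read as a simple screening rule. The sketch below is a hypothetical illustration, not the reviewers' procedure (screening was performed by two reviewers in Covidence): the `Record` fields and the `screen` function are assumed names, the judgments stored in the Boolean fields would still be made by a human reviewer, and language is omitted because the exclusion criteria apply no language restriction.

```python
from dataclasses import dataclass

EARLIEST_YEAR = 2005  # documents developed before 2005 are excluded


@dataclass(frozen=True)
class Record:
    """One retrieved document with the reviewer's judgments (assumed fields)."""
    title: str
    addresses_health_app_regulation: bool
    from_sub_saharan_africa: bool
    year: int


def screen(record: Record):
    """Return (eligible, reasons), listing every exclusion criterion that applies."""
    reasons = []
    if not record.addresses_health_app_regulation:
        reasons.append("not related to regulation of health apps")
    if not record.from_sub_saharan_africa:
        reasons.append("from a country outside sub-Saharan Africa")
    if record.year < EARLIEST_YEAR:
        reasons.append("developed before 2005")
    return (not reasons, reasons)


# Hypothetical records for illustration only.
records = [
    Record("National eHealth strategy (hypothetical)", True, True, 2016),
    Record("Hospital building code (hypothetical)", False, True, 2012),
    Record("Telemedicine guidance (hypothetical)", True, False, 2003),
]

for record in records:
    eligible, reasons = screen(record)
    print(record.title, "->", "include" if eligible else f"exclude: {'; '.join(reasons)}")
```

Recording the reasons alongside the decision makes it easier to discuss discrepancies between reviewers and to report exclusions.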

Data Charting (Extraction)

Two reviewers (BAB and SI), in consultation with the other members of the research team, developed the data extraction forms using an iterative process that included piloting data extraction and refinement until a consensus was reached.

We proposed in the study protocol [ 30 ] that data extraction would be conducted by the 2 reviewers independently. However, owing to the approach adopted for data extraction (deductive qualitative content analysis), 1 reviewer, rather than 2, initially extracted data from the included documents, and any concerns were discussed with a second reviewer [ 41 ]. Unresolved issues were then discussed and resolved with a third reviewer in a steering group meeting.

Collating, Summarizing, and Reporting Results

To address the research questions (particularly question 2), we adopted a deductive descriptive qualitative content analysis method to analyze and report the key findings. The policy analysis framework by Walt and Gilson [ 42 ] was adapted and applied to ensure that there was a consistent way of organizing the key findings: (1) Content (which aspects are regulated and which aspects are not?)—these are the components that directly or indirectly address regulatory issues related to the use of health apps for self-management, including areas that have not been addressed. (2) Context (why are those aspects regulated?)—this characterizes the rationale indicated for addressing regulatory issues related to the use of health apps for self-management. (3) Process (how are the regulatory standards developed and implemented?)—this describes the methods or approaches used to develop and implement regulatory standards. (4) Actors (who are the regulatory standards targeted toward?)—these are the key actors targeted by the standards.

Using a deductive descriptive qualitative content analysis approach, we examined each included document to systematically identify texts for concepts, patterns, and other relevant information. We then categorized them under content, context, process, or actors in relation to regulating health apps for self-management. The findings under content and context were further organized based on 4 predefined regulatory categories or themes as documented in the literature, namely (1) technical and clinical safety [ 12 - 14 ], (2) data protection and security [ 15 , 16 ], (3) standards and interoperability [ 28 , 31 ], and (4) inclusion and equitable access [ 24 - 29 ].

To address the third research question, the Reporting Items for Stakeholder Analysis (RISA) tool [ 41 ] was used as a guide to group key stakeholders based on role categorization as recognized globally by the WHO, the ITU, and UNESCO [ 32 , 33 , 43 ].

Ethical Considerations

Primary data were not collected in this study. Therefore, no ethics approval was required.

Search Results

A total of 2900 records were obtained after removing duplicates. Although the literature search was conducted in English, the search also yielded documents written in French and Portuguese from the ICTworks repository [ 44 ]. Following the initial screening of the title and abstract (or executive summaries), 73 documents were retrieved for full-text assessment. After applying the inclusion criteria for the full-text assessment, 49 documents were found eligible for inclusion in the review.

The PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) flow diagram [ 45 ] showing the study selection process is presented in Figure 1 .
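The selection counts reported above can be cross-checked with simple arithmetic. The following is a minimal sketch that uses only the figures stated in this review (2900 records, 73 full-text assessments, 49 included documents, and the breakdown of 6, 35, and 8 documents by type reported later in the Results); the variable names are illustrative.

```python
# Counts taken from the text of this review; variable names are illustrative.
records_after_deduplication = 2900
full_text_assessed = 73
included = 49

# Documents excluded at each screening stage, derived by subtraction.
excluded_at_title_abstract = records_after_deduplication - full_text_assessed
excluded_at_full_text = full_text_assessed - included

# Breakdown of included documents by type, as reported in the Results.
included_by_type = {
    "stand-alone regulatory standards and guidance": 6,
    "national policies and strategies": 35,
    "other related national documents": 8,
}

print(excluded_at_title_abstract)      # 2827
print(excluded_at_full_text)           # 24
print(sum(included_by_type.values()))  # 49
```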


Types of Documents

On the basis of the inclusion criteria, 3 categories of documents were considered for this review, namely “stand-alone regulatory standards and guidance that potentially regulate health apps for self-management,” “national policies and strategies on digital health,” and “other national documents that relate to the regulation of health apps for self-management.” Table 1 presents the types of documents obtained for each country within sub-Saharan Africa.

Characteristics of the Included Documents

Stand-alone regulatory standards and guidance.

We identified and included 6 stand-alone regulatory standards [ 18 , 19 , 46 - 49 ] from 3 countries (Ethiopia, Kenya, and Nigeria). All 6 documents were written in English. The years of development ranged between 2013 and 2021, as indicated in Multimedia Appendix 3 . The years of implementation were not specifically stated.

Although none of the included regulatory standards were exclusively developed to regulate health apps for self-management, 3 of them (Kenya Standards and Guidelines for mHealth Systems [ 18 ], Kenya Standards and Guidelines for E-Health Systems Interoperability [ 47 ], and Health Sector Information and Communications Technology Standards and Guidelines [ 48 ]) provided concept and information relevant to the regulation of health apps and were included in the qualitative content analysis. The Kenya Standards and Guidelines for mHealth Systems [ 18 ] provides standards and guidelines on the design, development, and implementation of mobile health (mHealth) solutions to ensure they are interoperable, scalable, and sustainable. The Kenya Standards and Guidelines for E-Health Systems Interoperability [ 47 ] outlines the principles, requirements, and standards for eHealth systems interoperability in Kenya. The Health Sector Information and Communications Technology Standards and Guidelines [ 48 ] provide guidance and a consistent approach across the health sector in Kenya for establishing, acquiring, and maintaining current and future information systems and information and communications technology (ICT) infrastructure that foster interoperability across systems. These 3 documents are a good combination of regulatory standards and guidance that provide content and context relevant to the regulation of health apps in sub-Saharan Africa.

The remaining 3 standards (standard for electronic health record [EHR] system in Ethiopia [ 19 ], standards and guidelines for electronic medical record systems in Kenya [ 46 ], and the health information exchange standard operating procedure and guideline [ 49 ]) were exclusively developed for EHRs or electronic medical records. However, they contain information relevant for mapping stakeholders with potential roles in regulating health apps for supporting self-management.

National Policies and Strategies on Digital Health

This review includes 35 national policies and strategies that are related to digital health (potentially covering health apps) [ 50 - 84 ] from 31 countries written in English, French, and Portuguese (Benin, Botswana, Burkina Faso, Burundi, Cameroon, Comoros, Côte d’Ivoire [Ivory Coast], Democratic Republic of the Congo, Eswatini, Ethiopia, Gabon, Ghana, Kenya, Liberia, Madagascar, Malawi, Mali, Mauritius, Mozambique, Namibia, Niger, Nigeria, Rwanda, Senegal, Sierra Leone, South Africa, Tanzania, Togo, Uganda, Zambia, and Zimbabwe). Although the literature search was conducted in English, it also yielded documents written in French and Portuguese from the ICTworks repository. The years of development and implementation range between 2005 and 2030. Policies and strategies written in French and Portuguese were translated into English using Google Translate. Documents labeled as national development plans, strategic plans, and strategic development plans were considered as national strategies.

National policies and strategies do not offer specific standards or guidance, but rather outline the country’s vision, policy directions, and strategies for using digital technologies in health care. They provide useful information for identifying digital health stakeholders who can play a role in regulating health apps for self-management. For example, Nigeria has a separate National Digital Health Policy [ 72 ] and a National Digital Health Strategy [ 71 ]. Both documents were developed by building on the lessons learned from the end-term evaluation of the previous National Health ICT Strategic Framework [ 85 ]. They describe Nigeria’s renewed vision, mission, goals, objectives, and strategies for the development and implementation of digital health with the aim to improve the quality, efficiency, and effectiveness of health service delivery and health outcomes.

It is worth noting that for countries with >1 policy or strategy, we included only the most recent versions. For instance, as mentioned earlier, Nigeria now has both a national digital health policy and a national digital health strategy. These 2 documents supersede and thus replace the old National Health ICT Strategic Framework [ 86 ]. Details of included documents are presented in Multimedia Appendix 3 .

Other Related National Documents

We included 8 other documents [ 20 , 85 , 87 - 92 ] from 6 countries (Ethiopia, Kenya, Liberia, Nigeria, South Africa, and Tanzania) that did not fall under either stand-alone regulatory standards and guidance or national policies and strategies. These were mostly frameworks, road maps, and reports that potentially provide information relevant to the use of health apps. The years of development and implementation range from 2016 to 2025. These documents do not provide standards or guidance, but they contain information that can help map the digital health stakeholders that potentially play a role in regulating health apps for self-management. When multiple versions of a document exist, only the latest version was taken into consideration. Multimedia Appendix 3 provides details of the included documents.

Content: Aspects That Are Regulated and Aspects That Are Not

Technical and clinical safety.

Technical and clinical safety standards are required to prevent or minimize the harm that may arise from the use of health ICT systems (including mHealth systems) as well as to improve health outcomes and user satisfaction. As shown in Figure 2 , two subthemes were generated from the included standards [ 18 , 47 , 48 ] as content under technical and clinical safety: (1) guidance on system quality and (2) guidance on software or app development, acquisition, support, and maintenance.


Notably, 2 of the included standards [ 18 , 47 ] provide guidance on system quality to ensure the quality, security, reliability, performance, and maintenance of eHealth and mHealth systems. The Kenya Standards and Guidelines for E-Health Systems Interoperability [ 47 ] recommend the implementation of a data quality protocol to ensure that the data collection, collation, analysis, interpretation, dissemination, and use are managed in accordance with the quality standards. Similarly, the Kenya Standards and Guidelines for mHealth Systems [ 18 ] recommends the inclusion of the following requirements in the technical manual: (1) minimum hardware requirements that should incorporate the preferred hardware architecture, (2) minimum software requirements that should include the minimum version of the underlying operating system as well as acceptable versions of related software, and (3) a detailed list of software dependencies (external libraries) necessary for the system to function properly.

The included standards [ 18 , 48 ] cover guidance on software or app development, acquisition, support, and maintenance, which aims to ensure the efficiency and effectiveness of eHealth and mHealth systems. The Kenya Standards and Guidelines for mHealth Systems [ 18 ] recommends 3 documents: (1) a technical manual that gives system administrators and implementers a detailed description of the system’s installation and maintenance processes; (2) a developer’s guide that gives software developers and programmers an overview of the system, a description of the software design methodologies, a description of the system architecture, and technical design diagrams; and (3) a user manual that helps users understand how the system works and how each feature operates. In addition, the technical manual contains instructions for operating the software; entering and updating data; and generating, saving, and printing reports.

Although the contents generated here provide guidance that is relevant to health apps, they are not specific to health apps. Moreover, there are no clear measures to enable individuals or organizations that use health apps to manage clinical risk appropriately.

Data Protection and Security

Data protection and security are crucial aspects of managing patient information, thus ensuring the confidentiality, integrity, and availability of data as well as the rights and interests of the patient. Two subthemes related to data protection and security are (1) security measures for adequate protection of patients’ digital records and (2) guidance on data exchange.

The included standards [ 18 , 48 ] provide security measures for eHealth or mHealth systems to ensure the adequate protection of digitally accessible patient records. These measures include authentication, accountability, identification, authorization, integrity, confidentiality, availability, security, administration, and audit. This will help to achieve confidentiality, integrity, availability, and nonrepudiation of patient data or health records. Additional levels of security such as data encryption are required when there is a need to store sensitive information on removable devices or media or outside the MOH premises.

The Kenya Standards and Guidelines for mHealth Systems [ 18 ] provide the following guidance on data exchange to ensure privacy: (1) anonymize client data as much as possible before they can be shared; (2) where possible, use pseudonyms for the client data before they can be shared; (3) aggregate client data before they can be shared to reduce possibilities of tracing the data back to the client; and (4) minimize data so that access is available only to the data set required for that particular use. With regard to privacy rules, the Kenya Standards and Guidelines for E-Health Systems Interoperability [ 47 ] propose that a notice of privacy practices should be given to patients describing how their information may be used or shared while also specifying their legal rights.
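The data exchange guidance above (pseudonymize, aggregate, and minimize client data before sharing) can be illustrated with a minimal sketch. This is not taken from the Kenyan standard: the record fields, the secret key, and the function names are hypothetical. A keyed hash only pseudonymizes an identifier; it does not by itself make the data anonymous.

```python
import hashlib
import hmac
from collections import Counter

# Hypothetical secret key; in practice it would be generated and stored securely.
SECRET_KEY = b"replace-with-a-managed-secret"


def pseudonymize(client_id: str, key: bytes = SECRET_KEY) -> str:
    """Replace a client identifier with a keyed hash (a pseudonym)."""
    digest = hmac.new(key, client_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def minimize(record: dict, allowed_fields: set) -> dict:
    """Keep only the fields required for the stated use."""
    return {k: v for k, v in record.items() if k in allowed_fields}


def aggregate(records: list, field: str) -> dict:
    """Share counts per category instead of client-level rows."""
    return dict(Counter(r[field] for r in records))


# Illustrative client records (entirely made up).
records = [
    {"client_id": "A001", "name": "Client One", "county": "Nairobi", "condition": "asthma"},
    {"client_id": "A002", "name": "Client Two", "county": "Nairobi", "condition": "diabetes"},
    {"client_id": "A003", "name": "Client Three", "county": "Kisumu", "condition": "asthma"},
]

# Minimize first, then swap the identifier for a pseudonym.
shared = []
for r in records:
    row = minimize(r, {"client_id", "county", "condition"})
    row["client_id"] = pseudonymize(row["client_id"])
    shared.append(row)

counts = aggregate(records, "condition")
print(counts)  # {'asthma': 2, 'diabetes': 1}
```

Which of the three techniques is appropriate depends on the intended use: aggregated counts are sufficient for reporting, whereas pseudonymized rows may be needed when records must be linked over time.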

Standards and Interoperability

Standards and interoperability are essential concepts in the field of IT, especially for systems that need to communicate and exchange data, as seen in the use of health apps for self-management. Two subthemes related to standards and interoperability are (1) interoperability as a basic requirement and (2) minimum standards to enable integration.

All the regulatory standards [ 18 , 47 , 48 ] highlight the importance of having interoperability as a basic requirement when selecting software products or services for use within the health system. This facilitates interaction across systems. For instance, to facilitate seamless interaction between mHealth systems and primary information systems for data capture, reporting, and decision support in various domains of the health system, the Kenya Standards and Guidelines for mHealth Systems [ 18 ] recommends the incorporation of at least 3 types of interoperability, namely, technical interoperability, semantic interoperability, and process interoperability.

Furthermore, 2 regulatory standards [ 18 , 47 ] proposed minimum interoperability standards to enable the integration of services and data exchange between various systems in health care. For instance, the Kenya Standards and Guidelines for mHealth Systems [ 18 ] suggests interoperability standards for mHealth systems that are consistent with the recommendations in internationally accepted standards. They include the following: (1) clinical messaging: mHealth systems should conform to Health Level 7 (HL7) version 3 standards and the corresponding implementation guideline; (2) clinical terminology: mHealth systems should use standard terminologies and classifications for clinical concepts (eg, International Classification of Diseases, tenth revision, for diseases; Systematized Nomenclature of Medicine for clinical data coding; Logical Observation Identifiers Names and Codes for laboratories; and RxNorm for pharmacies); (3) the mHealth system must use the latest versions of international standards, such as HL7 Clinical Document Architecture, for electronic sharing of clinical documents; (4) concepts: mHealth systems will use the idea of “concepts” so that information can be transmitted between systems without losing meaning or context, and the HL7 Reference Information Model or other appropriate standards are recommended for implementing concepts; and (5) architecture: to develop mHealth systems, developers should define the system architecture, which should include data elements and business logic. Furthermore, to define how mHealth systems interact with other systems, developers of mHealth solutions must provide application programming interfaces. FHIR (Fast Healthcare Interoperability Resources) is the preferred application programming interface interoperability standard.
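To make the idea of coded clinical concepts concrete, the following minimal sketch builds a FHIR-style Condition resource as plain JSON and checks that it carries a code from a named terminology. It is an illustration only and is not drawn from the Kenyan standard: the patient reference and the helper function are hypothetical, and the coding system URI follows the FHIR convention for ICD-10 as we understand it, so it should be verified against the FHIR specification before use.

```python
import json

# Assumed FHIR identifier for the ICD-10 code system; verify against the specification.
ICD10_SYSTEM = "http://hl7.org/fhir/sid/icd-10"

# A FHIR-style Condition resource expressed as a plain dictionary.
# "Patient/example" is a hypothetical reference used only for illustration.
condition = {
    "resourceType": "Condition",
    "subject": {"reference": "Patient/example"},
    "code": {
        "coding": [
            {"system": ICD10_SYSTEM, "code": "J45", "display": "Asthma"}
        ]
    },
}


def has_coded_concept(resource: dict, system: str) -> bool:
    """Return True if the resource carries at least one code from `system`."""
    codings = resource.get("code", {}).get("coding", [])
    return any(c.get("system") == system and c.get("code") for c in codings)


# Serialize for exchange and confirm that nothing is lost on the round trip.
payload = json.dumps(condition)
received = json.loads(payload)

print(has_coded_concept(received, ICD10_SYSTEM))  # True
```

Because the meaning of the diagnosis travels as a code plus its code system rather than as free text, a receiving system can interpret it without knowing how the sending app labels conditions internally.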

Inclusion and Equitable Access

Inclusion and equitable access are essential principles to ensure that health apps are culturally appropriate and relevant and accessible to everyone, regardless of gender, ethnicity, location, or economic status.

All the included regulatory standards [ 18 , 47 , 48 ] indicate that they were developed based on a combination of participatory and consultative approaches involving multiple actors or stakeholders, thus promoting inclusion. However, there are no specific measures or guidance to ensure adequate engagement and representation of all the relevant stakeholders and to sustain that engagement.

The Kenya Standards and Guidelines for mHealth Systems [ 18 ] proposes the following systems attributes to ensure equitable access to mHealth services at all times and from anywhere: (1) allocation of adequate storage and bandwidth capacity; (2) fast response time; (3) fast recovery capabilities; (4) performance monitoring; (5) business continuity processes, for example, backups; and (6) redundant sites and links. Furthermore, the Kenya Standards and Guidelines for mHealth Systems [ 18 ] prescribes the following metrics for measuring system availability: (1) downtime per year, (2) mean time between failure, (3) mean time to repair, and (4) failure in time.
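The 4 availability metrics listed above can be computed from a simple failure log. The sketch below uses common textbook definitions, which are an assumption here because the review does not reproduce the formulas from the standard: mean time between failure is uptime divided by the number of failures, mean time to repair is total repair time divided by the number of failures, and failure in time is the number of failures per billion operating hours. The input figures are invented for illustration.

```python
HOURS_PER_YEAR = 8760  # non-leap year


def availability_metrics(period_hours: float, failures: int, total_repair_hours: float) -> dict:
    """Compute availability metrics from an observation period and a failure log."""
    if failures <= 0:
        raise ValueError("at least one failure is needed to estimate MTBF and MTTR")
    uptime = period_hours - total_repair_hours
    mtbf = uptime / failures              # mean time between failure (hours)
    mttr = total_repair_hours / failures  # mean time to repair (hours)
    availability = mtbf / (mtbf + mttr)   # fraction of time the system is up
    return {
        "mtbf_hours": mtbf,
        "mttr_hours": mttr,
        "availability": availability,
        "downtime_hours_per_year": (1 - availability) * HOURS_PER_YEAR,
        "fit": failures / uptime * 1e9,   # failures per 10^9 operating hours
    }


# Hypothetical example: 1 year of observation, 4 failures, 8 hours of repair in total.
metrics = availability_metrics(period_hours=8760, failures=4, total_repair_hours=8)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

In this hypothetical case, the mean time between failure is 2188 hours, the mean time to repair is 2 hours, and the system is down for about 8 hours per year.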

Although the abovementioned systems attributes and metrics for measuring system availability are important, the included standards do not offer any concrete guidance or model for achieving a sustainable funding mechanism for health apps to ensure that they are readily available and accessible to those who need them.

Context: Reasons Why Those Aspects Are Regulated

The 3 standards [ 18 , 47 , 48 ] were developed to address unsafe, isolated, and inconsistent implementation. The Health Sector ICT Standards and Guidelines [ 48 ] suggest that although there has been a lot of ICT investment in the health sector leading to improvement in service delivery and information exchange, there remains the challenge of inconsistency in ICT implementation and harmonization of the health sector system requirements. Hence, there is a need to adopt global best practices for software development, acquisition, support, and maintenance by the MOH. In addition, the Kenya Standards and Guidelines for mHealth Systems [ 18 ] indicates that standards and guidelines are necessary to ensure a consistent approach to the development of ICT systems. Similarly, the Kenya Standards and Guidelines for E-Health Systems Interoperability [ 47 ] recognize the need to ensure that the processes of collecting, collating, analyzing, interpreting, disseminating, and using data are consistent with data quality standards.

To build mutual trust and maximize the benefits of eHealth information exchange, the Kenya Standards and Guidelines for E-Health Systems Interoperability [ 47 ] reiterate that as health data are constantly being exchanged across health information systems, robust security standards are required to maintain their integrity and confidentiality. This will build the trust of service users and consequently help to maximize the benefits of eHealth information exchange such as in self-management.

Two of the included regulatory standards [ 47 , 48 ] indicate that the context for standards and interoperability was (1) to address poor coordination, duplication of efforts, and inefficient use of resources and (2) to promote the integration of ICT systems.

The Kenya Standards and Guidelines for E-Health Systems Interoperability [ 47 ] acknowledge that the absence of interoperability standards over the years has led to the duplication of efforts and the inefficient use of ICT resources in health care. Now that ICT has become increasingly relevant in improving efficiency in health service delivery, the Kenya MOH recognizes the need to adopt a standardized approach, hence the development of interoperability standards for eHealth systems. In addition, the Health Sector ICT Standards and Guidelines [ 48 ] emphasize the relevance of interoperability as a requirement for addressing the inconsistency in implementing ICT in the health sector.

The Health Sector ICT Standards and Guidelines [ 48 ] consider “integration of ICT systems” as one of its key guiding principles, acknowledging the lack of information systems integration as a challenge experienced by ICT services across Kenya.

The contexts for inclusion and equitable access as generated from included standards [ 18 , 47 , 48 ] were (1) to promote inclusion and (2) to promote equitable access to services.

To promote inclusion, the standards [ 18 , 47 , 48 ] highlight the importance of involving and engaging multiple actors and stakeholders during the development process. However, no emphasis was placed on the need to sustain stakeholder engagement during the implementation process.

Pertaining to equitable access, the Kenya Standards and Guidelines for mHealth Systems [ 18 ] acknowledges that the public health care system is largely unavailable to most of the population in many developing countries because of geographical location, resource constraints, inefficiencies, and lack of awareness. Hence, it recognizes the importance of ensuring that mHealth services are always accessible by users and from anywhere as well as the need to put in place mechanisms to make this happen.

Process: How the Regulations Are Developed and Implemented

Two themes were generated from the included standards: development and implementation processes [ 18 , 47 , 48 ].

Development Process

All the included standards [ 18 , 47 , 48 ] indicate that they were developed through a participatory process and in consultation with a range of subject experts and interest groups. In addition, the standards [ 18 , 47 , 48 ] adopted a multisectoral approach to engage health-related stakeholders from government ministries or agencies and development partners and a range of subject experts and interest groups. It has also been reported that these standards [ 18 , 47 , 48 ] were developed based on international best practices and with reference to international standards. However, there is no indication that a stakeholder engagement strategy was adopted to sustain the engagement of stakeholders through the entire development and implementation process.

Implementation Process

The 3 regulatory standards [ 18 , 47 , 48 ] identify the key requirements to ensure effective implementation of IT services in the health sector. These are (1) legal authority, (2) coordination, (3) building capacity, and (4) monitoring and evaluation.

The included standards [ 18 , 47 , 48 ] were established based on the legal provisions enshrined in the health and other related acts and laws of Kenya as well as the relevant policies and strategies. Hence, it is expected that their implementation will comply with and be backed by those legal provisions. For example, the Health Sector ICT Standards and Guidelines [ 48 ] indicate that its implementation will be supported by the authority from the Kenya Communications Act 2009, E-Government Strategy, and National ICT Policy. Similarly, the Kenya Standards and Guidelines for mHealth Systems [ 18 ] asserts that it will be implemented by complying with existing and relevant national policies, legal frameworks, strategies, and standards, including the Health Information Policy, ICT Standards, and System Interoperability Principles.

The included standards [ 18 , 47 , 48 ] report that the implementation of regulations will require robust coordination mechanisms. For instance, the Health Sector ICT Standards and Guidelines [ 48 ] indicate that, as the Ministry’s ICT resource manager, the principal secretary (also the head of ICT), in collaboration with the ICT Governance Committee, is responsible for coordinating the implementation of the standard. The ICT Governance Committee comprises representatives from the heads of departments and ICT development partners in the health sector. The committee’s responsibilities include overseeing, enforcing, and reviewing standards as well as initiating ICT projects.

The Health Sector ICT Standards and Guidelines [ 48 ] highlight the need for capacity building or training of the MOH staff and stakeholders who are the primary users of the Ministry’s ICT services. This will enhance their capacity to implement the guidelines provided in the document in line with the ministry’s human resource development policies, regulations, and rules. However, it is acknowledged that building capacity for health ICT is a challenge given that there is low adoption of ICT among health providers, and ICT is not routinely included in the course content of most training programs. The Kenya Standards and Guidelines for mHealth Systems [ 18 ] listed the “number of mHealth practitioners trained on the standards and guidelines” as one of the indicators for monitoring and evaluating mHealth interventions.

The Health Sector ICT Standards and Guidelines [ 48 ] assert that monitoring and evaluation is an essential role of the MOH to ensure efficiency, accountability, and transparency throughout the implementation period. It further stresses that all those who use the Ministry’s ICT services are required to adhere to the provisions in the standard as the MOH will carry out quarterly monitoring exercises on the use of the standard to ensure compliance based on clear indicators. Furthermore, the ICT Governance Committee will periodically review and amend the standard to keep it relevant and effective. Similarly, the Kenya Standards and Guidelines for mHealth Systems [ 18 ] establishes the following key indicators for effectively monitoring and evaluating the implementation of the standards and guidelines: (1) the number of counties in which the MOH has disseminated the standards and guidelines, (2) the number of counties successfully implementing the standards and guidelines, (3) the number of mHealth practitioners trained on the standards and guidelines, (4) the number of mHealth practitioners accessing the standards and guidelines, (5) the number of mHealth practitioners who correctly understand the standards and guidelines, (6) the number of stakeholders who adhere to the standards and guidelines, (7) the number of mHealth systems that follow the required development steps, and (8) the number of mHealth practitioners who have implemented their systems by using the standards and guidelines. In addition, the Kenya Standards and Guidelines for mHealth Systems [ 18 ] indicates that the outlined standards will be reviewed every 3 years to ensure they are up to date with new changes including the changes in policies and systems upgrades.

Although all the abovementioned indicators are relevant, the implementation process is not explicit on the approach for regulating health apps and ensuring compliance with regulatory standards and guidance.

Actors: Those the Regulations Are Targeted at

The included standards [ 18 , 47 , 48 ] identified 2 main groups of actors for whom the regulations and guidance were targeted. They included (1) those who provide digital health services and (2) those who use the ICT infrastructure of the MOH.

Two of the standards [ 47 , 48 ] indicated that the regulations should be implemented by all individuals and organizations that provide ICT-related health care services to the public. Similarly, the Health Sector ICT Standards and Guidelines [ 48 ] state that all those who access or use the MOH ICT infrastructure are expected to adhere to the guidelines outlined in the document.

Mapping of Stakeholders

To address the third research question, we conducted a stakeholder mapping guided by the RISA tool [ 41 ].

A total of 11 categories of key stakeholders were identified from all 49 included documents (6 stand-alone regulatory standards and guidance, 35 national policies or strategies, and 8 other related documents). These categories are consistent with the digital health stakeholders recognized by the WHO, ITU, and UNESCO [ 32 , 33 , 43 ]. Table 2 presents the mapping of stakeholders according to their role categorization. A more detailed table with a potential role description with regard to regulating health apps for self-management is presented in Multimedia Appendix 4 .


This paper presents the findings of a scoping review of regulatory standards and guidance for the use of health apps for self-management in sub-Saharan Africa. To the best of our knowledge, this is the first study that attempted to identify and assess the extent to which regulatory standards and guidance regulate and guide the use of health apps for self-management in sub-Saharan Africa as well as map out the key stakeholders and their potential roles.

Our findings reveal that only 1 country (Kenya) in sub-Saharan Africa currently has national regulatory standards that could potentially regulate the use of health apps for self-management. The included standards failed to give adequate attention to inclusion and equitable access. This is concerning given the growing need to promote the adoption of culturally appropriate and relevant health apps and to ensure that they are available to those who need them regardless of gender, ethnicity, geographical location, or financial status [ 24 - 29 ]. Consequently, this review provides insights into the regulation of health apps for self-management in sub-Saharan Africa, which needs to be given more attention if the potential of these apps is to be harnessed in the region.

Principal Findings

We identified 49 documents from 31 countries in sub-Saharan Africa. Although none of the included standards provided a specific set of regulations on health apps for self-management, we identified 3 standards [ 18 , 47 , 48 ] that provided relevant information regarding the regulation of health apps. The included national policies and strategies, in contrast, only outline the goals and commitments made by national governments to promote the adoption of digital technologies in the health sector and the plans and paths set forth to achieve these goals. However, the information they provided was relevant for identifying and mapping digital health stakeholders who potentially have vital roles in regulating the use of health apps for self-management.

The policy analysis framework (content, context, process, and actors) [ 42 ] was adapted and applied to organize the key findings. The content covered the following areas: guidance on systems quality; guidance on software and app development, acquisition, support, and maintenance; security measures for adequate protection of patients’ digital records; guidance on data exchange; interoperability as a basic requirement; minimum standards to enable integration; involvement and engagement of relevant stakeholders; and system attributes for equitable access to services. Meanwhile, the context was to address unsafe, isolated, and inconsistent implementation; to build mutual trust and maximize the benefits of eHealth information exchange; to address poor coordination, duplication of efforts, and inefficient use of resources; to promote the integration of ICT systems; and to promote inclusion and equitable access to services. The process involved the development process (which covers participatory and consultative processes and multisectoral approach, with reference to international standards and best practices) and the implementation process (which covers legal authority, coordination, capacity building, and monitoring and evaluation). The targeted actors were those who provided digital health services and those who used the ICT infrastructure of the MOH.

Furthermore, key stakeholders with potential roles in regulating health apps for self-management were identified. They include the government, regulatory bodies, funders, intergovernmental and nongovernmental organizations, academia, and the health care community.

Implications of the Study Findings for Practice

Regulatory standards and guidance act as a bridge between technological innovation and its safe and effective use in health care. They ensure that while technology continues to advance, the safety and trust of patients are never compromised. Among the plethora of health apps on the market, the over-the-counter, nonregulated apps such as wellness and fitness apps are the most mainstream [ 93 - 95 ]. On the other side of the spectrum, there are regulated health apps that are classified under medical devices or software as medical device products [ 94 , 95 ]. Some of these are prescription-only apps, such as digital therapeutics (DTx) apps for managing substance dependence [ 95 , 96 ].

Although some high-income countries have made significant strides in ensuring the safety, effectiveness, and accessibility of health apps, the journey has not been without challenges. Sub-Saharan Africa, although dealing with its own unique set of challenges, has the opportunity to learn from the experiences of these high-income countries. This could potentially allow the region to bypass some of the hurdles that these countries encountered.

Technical and clinical safety are essential requirements that health apps must meet before they can be considered for use for self-management to minimize the risk of harm to patients. It is well documented that health apps that function poorly pose a serious threat to the safety of patients. For example, one study [ 12 ] revealed that widely used health apps designed to calculate and estimate insulin doses could endanger patients by providing incorrect or inappropriate dose recommendations. Similarly, 2 successive studies that assessed the contents and tools of apps for asthma found that none of the apps in the first study offered comprehensive information or adequate tools for asthma self-management. The follow-up study, conducted 2 years later, showed a 2-fold increase in the number of asthma apps but no improvement in the content and tools offered by the newer apps. In fact, many apps recommended self-management procedures that were not supported by evidence [ 13 , 14 ]. Accordingly, some health apps that support the self-management of long-term conditions do not adhere to evidence-based guidelines and are unresponsive to the evolving health needs of patients.

Although the context of included regulatory standards with regard to technical and clinical safety was to address unsafe, isolated, and inconsistent implementation, the guidance provided by these regulatory standards is not specific to health apps, and they do not provide appropriate guidance and standards for health organizations and other key stakeholders to establish a framework for managing the clinical risks associated with deploying and implementing self-management health apps. Considering the rapid advancements in digital health (including artificial intelligence [AI] or machine learning and big data), health apps will increasingly play a crucial role in supporting self-management through digitally enabled care pathways that will improve personalized care and health outcomes [ 97 , 98 ]. Therefore, it is imperative to ensure the technical reliability and clinical safety of health apps for self-management through robust regulatory standards and guidance. For instance, a guide on the criteria for health app assessment, developed by the UK government, includes technical stability and clinical safety as criteria for deciding whether health apps should be considered for use in the National Health Service (NHS) [ 99 ]. In addition, medical device apps are required to conform to the NHS clinical risk management standards as part of the clinical safety requirements [ 99 , 100 ]. In the event of any concerns regarding the safety of a medical device app, the Yellow Card reporting system can be used by a responsible clinical safety officer or any other individual to notify the Medicines and Healthcare products Regulatory Agency (MHRA) [ 101 , 102 ].

To adequately manage patient information when health apps are used for self-management, data protection and security standards and guidance are required. They guarantee that data are kept and handled safely and responsibly within the provisions of the law and that patients’ rights and interests are respected.

There have been ongoing concerns about compliance with ethical standards, the principles of confidentiality of information, and data privacy. For example, an assessment of apps that had previously been endorsed by the former UK NHS Apps Library revealed substantial gaps in compliance with data protection principles regarding the collection, storage, and transmission of personal information. This has raised a fundamental concern about the credibility of developer disclosures and whether these disclosures can be trusted by certification programs [ 15 ]. Another study assessed the privacy practices of the 36 most popular apps for depression and smoking cessation for Android and iOS in the United States and Australia [ 16 ]. The findings revealed that although only 69% (25/36) of the apps included a privacy policy, 92% (33/36) of the apps shared data with a third party, and 92% (23/25) of the apps with a privacy policy disclosed sharing data with a third party in their policy. Although 81% (29/36) of the apps shared data with Google and Facebook for the purposes of advertising, marketing, or analytics, only 43% (12/28) of the apps that shared data with Google and 50% (6/12) of the apps that shared data with Facebook disclosed this in their policy [ 16 ].
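The proportions quoted from the privacy study [ 16 ] can be rechecked from the counts given in the paragraph above. The following is a minimal sketch for that purpose only; the dictionary labels and the helper name `pct` are our own, and the lower bound at the end is a simple counting argument rather than a figure reported in the study.

```python
# Counts as quoted in the text: (numerator, denominator, reported percentage).
reported = {
    "included a privacy policy": (25, 36, 69),
    "shared data with a third party": (33, 36, 92),
    "policy disclosed third-party sharing": (23, 25, 92),
    "shared data with Google or Facebook": (29, 36, 81),
    "disclosed sharing with Google": (12, 28, 43),
    "disclosed sharing with Facebook": (6, 12, 50),
}


def pct(numerator: int, denominator: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)


# Recompute each percentage and compare it with the reported value.
checks = {label: pct(n, d) == p for label, (n, d, p) in reported.items()}
for label, ok in checks.items():
    print(f"{label}: {'consistent' if ok else 'MISMATCH'}")

# Counting argument (an inference, not a reported result): 33 apps shared data
# but only 25 had any policy, so at least 33 - 25 apps shared data with no policy.
min_sharing_without_policy = max(0, 33 - 25)
print("apps sharing data without any policy (lower bound):", min_sharing_without_policy)
```

All six reported percentages are consistent with their counts under ordinary rounding.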

In this regard, health app developers and providers in the United Kingdom are required to conduct a data protection risk assessment before they launch or update their apps to ensure compliance with the United Kingdom General Data Protection Regulation (GDPR) and other relevant regulations, including the Data Protection Act 2018 [ 103 ]. By conducting a data protection risk assessment, health app developers and providers can demonstrate that they are accountable; they respect the privacy and dignity of their users; and that they deliver safe, effective, and ethical solutions [ 104 ].

Health apps are expected to play an increasingly important role in supporting self-management. However, this ambition can only be achieved if citizens trust that these apps are collecting and analyzing data safely and in accordance with robust regulatory standards and guidance. It is also crucial that these apps provide reliable information that clinicians can act on [ 98 ]. The context of the standards included in this study regarding data protection and security was to build mutual trust and maximize the benefits of eHealth information exchange. Trust is a key factor in the successful adoption and use of health apps, and transparency in data handling and clinical decision-making is essential to build and maintain that trust. This is also paramount for the widespread acceptance and impact of health apps on health care outcomes in sub-Saharan Africa.

We acknowledge the existence of numerous national laws related to data protection and security outside the health sector. Hence, guidelines that link these laws together must be provided to ensure compliance with all relevant laws and guidance when using patient data. An example of how to achieve this is the United Kingdom’s guide to good practice for digital and data-driven health technologies, which provides guidelines on how to abide by the laws and principles that govern data security and protection in the United Kingdom, including the GDPR, Data Protection Act 2018, and Caldicott Principles [ 105 ].

Standards and interoperability are essential for effectively developing, deploying, and implementing health apps to support self-management in sub-Saharan Africa. Interoperability is the ability of different systems, devices, or applications to communicate and exchange data with each other in a coordinated manner, thus providing timely and seamless portable information across organizational, regional, and national boundaries and optimizing both individual and population health [ 106 ]. In the same vein, standards enable interoperability between systems or devices through a common language and a common set of expectations [ 106 ].

Interoperability is crucial in improving the quality, safety, and efficiency of care delivery as well as empowering patients and providers with access to relevant and timely information [ 99 ]. One of the most widely used and accepted interoperability standards for health care data exchange is FHIR [ 106 , 107 ]. FHIR is a global industry standard developed by HL7 International. FHIR is designed to be quick to learn and implement and to support a variety of use cases, including self-management [ 108 ]. By using apps that are based on an FHIR standard, patients can benefit from data analytics that show how their health data relate to their chronic conditions or wellness goals [ 109 ]. They could also access all their health information from one place, even if they visit different health professionals who use different electronic medical records or EHR, thus promoting integrated care [ 28 , 31 , 33 , 109 - 115 ]. As a result, patient care can easily be coordinated.
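To make the idea of a shared data format concrete, the sketch below shows how a self-management app might represent a single blood glucose reading as an FHIR Observation resource in JSON. It is an illustration under stated assumptions, not a validated implementation: the patient identifier, the reading, and the helper names `make_glucose_observation` and `missing_required` are hypothetical, and the check covers only a few top-level elements rather than the full FHIR specification.

```python
import json


def make_glucose_observation(patient_id: str, value_mmol_per_l: float, effective: str) -> dict:
    """Build a minimal FHIR-style Observation for a blood glucose reading."""
    return {
        "resourceType": "Observation",
        "status": "final",
        "code": {
            "coding": [
                {
                    # LOINC code for blood glucose, as used in published FHIR examples.
                    "system": "http://loinc.org",
                    "code": "15074-8",
                    "display": "Glucose [Moles/volume] in Blood",
                }
            ]
        },
        # Hypothetical patient reference; a real app would use the server's ID.
        "subject": {"reference": f"Patient/{patient_id}"},
        "effectiveDateTime": effective,
        "valueQuantity": {
            "value": value_mmol_per_l,
            "unit": "mmol/L",
            "system": "http://unitsofmeasure.org",
            "code": "mmol/L",
        },
    }


def missing_required(resource: dict) -> list:
    """Return the top-level elements this sketch treats as mandatory but absent."""
    required = ("resourceType", "status", "code")
    return [key for key in required if key not in resource]


obs = make_glucose_observation("example-123", 6.3, "2024-01-15T08:30:00Z")
print(json.dumps(obs, indent=2))
print("missing elements:", missing_required(obs))
```

Because the reading is coded with shared terminologies (LOINC for the measurement, UCUM for the unit), any system that understands FHIR could in principle interpret it without knowing which app produced it.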

The context of the included regulatory standards with regard to standards and interoperability was to address poor coordination, duplication of efforts, and inefficient use of resources and to promote the integration of ICT systems. However, in sub-Saharan Africa, there are many challenges and barriers to the adoption and implementation of interoperability standards, such as the lack of awareness or knowledge of the benefits and requirements of interoperability standards among stakeholders; lack of incentives or regulations to encourage or enforce the adoption of interoperability standards by app developers and vendors; lack of resources or capacity to implement interoperability standards, including technical expertise, infrastructure, funding, or governance; and lack of alignment or coordination among the different actors and initiatives involved in developing, deploying, and implementing the digital health interventions [ 30 , 116 - 119 ]. To address these challenges, some possible solutions may include raising awareness and education on the importance and value of interoperability standards for health apps among all relevant actors; developing and implementing policies and guidelines that promote or mandate the use of interoperability standards by app developers and vendors; providing technical assistance and support for app developers and vendors to adopt and implement interoperability standards, such as tools, frameworks, testing, certification, or accreditation; and establishing and strengthening collaboration and coordination among the different stakeholders and initiatives involved in health app development, deployment, and implementation in sub-Saharan Africa. 
In addition, the Digital Health Platform Handbook, a toolkit developed by the collaborative efforts of the WHO and ITU [ 120 ], can help countries in sub-Saharan Africa to develop and implement digital health platforms as the underlying infrastructure for interoperable and integrated national digital health systems. The digital health platform is a system-wide approach to developing digital health solutions with the aim to overcome the problems of siloed, vertical, and isolated applications and systems that hamper data management, innovation, efficiency, and impact in the health sector.

Inclusion and equitable access are crucial to ensuring that health apps and related services are culturally appropriate and relevant as well as accessible to all who need them, regardless of gender, ethnicity, geographical location, ability, or financial status [ 24 - 29 ]. This is the key to promoting a “sense of belonging” and “ownership” and thus underscoring the importance of stakeholder mapping and involvement or engagement through the development and implementation process [ 22 ].

In this study, the included regulatory standards demonstrate the importance of inclusion by adopting both a participatory and consultative approach involving multiple stakeholders from different sectors. However, the standards do not provide clear guidance to ensure the adequate participation and sustained engagement of all relevant stakeholders. The lack of concise guidance to ensure the adequate participation and engagement of all relevant stakeholders, especially the susceptible and disadvantaged groups, can increase the risk of tokenistic tendencies, which can undermine the cultural appropriateness of health apps [ 25 , 121 ]. Some susceptible groups, such as women and people with low socioeconomic status, may face additional barriers to accessing and using health apps, such as lack of digital literacy, privacy concerns, cultural norms, or stigma [ 25 ]. Similarly, the cost of developing, maintaining, and updating health apps may not be covered by public or private health insurance schemes, which could limit their affordability and availability for low-income or uninsured populations [ 95 ]. However, there is no specific guidance or model for an effective funding mechanism for health apps in the included regulatory standards.

To address these challenges and ensure equitable access to health apps for self-management in sub-Saharan Africa, possible measures may include developing policies and regulations that support integrating health app interventions into existing health systems and financing mechanisms and engaging with stakeholders from different sectors and backgrounds (including health professionals, patients, communities, governments, civil society, academia, and industry) to co-develop and co-implement frameworks or models that promote the use of health apps for self-management in ways that are responsive to the local context and needs. Moreover, establishing regulations that provide appropriate financing or reimbursement options will reduce the risk of developers of good quality health apps turning to data mining for revenue, thus increasing privacy concerns [ 95 ]. For instance, in Germany, the reimbursement of health apps classified as medical devices (Digitale Gesundheitsanwendungen) was introduced in 2021 under the statutory health insurance [ 122 , 123 ]. When a medical device is prescribed by a physician or a physiotherapist, the manufacturer must submit an application to the German Federal Institute for Drugs and Medical Devices (Bundesinstitut für Arzneimittel und Medizinprodukte) for approval [ 123 ]. The Federal Association of the Statutory Health Insurance Funds (Spitzenverband Bund der Krankenkassen) determines and negotiates the reimbursement thresholds following approval. However, the manufacturer must demonstrate that the app is safe, functional, and of good quality; complies with data protection requirements; and benefits patient care [ 123 ].

The process of regulating health apps essentially involves the development and implementation of regulatory standards and guidance. According to our study, the development process comprises a participatory and consultative process, a multisectoral approach, and a reference to international standards and best practices. In contrast, the implementation process is ongoing and requires appropriate legal authority, coordination, capacity building, and monitoring and evaluation.

We recognize that health apps can be accessed and used by patients from different parts of the world, and this means that countries need to carefully consider whether health apps that are accessed and used by their citizens meet the national or regional legal and ethical requirements, including their cultural and linguistic needs [ 23 ]. For countries in sub-Saharan Africa, a cross-border or regional collaboration between national legal authorities through the coordination of agencies such as the African Medicines Regulatory Harmonization (AMRH) may help to ensure that health apps built for the region are safe, effective, and user-friendly for everyone, considering the contextual differences of the countries [ 23 ]. For instance, all medical device companies that want to sell their products in the European market must obtain a Conformité Européenne (CE) mark for their devices, which indicates that they meet the legal requirements and can be freely circulated within the European Union [ 124 ]. Although the European Union member states regulate medical devices, the European Medicines Agency is involved in the regulatory process.

The regulation of health apps is extremely complex and involves a wide range of stakeholders. Therefore, a robust coordination mechanism is essential to reduce the risk of fragmentation and duplication of efforts and to promote the efficient use of resources. Most countries in sub-Saharan Africa have units in health ministries that coordinate and oversee the regulation of medical products. These units should be autonomous, full-fledged departments with legal authority (boards or commissions) to ensure independent, transparent, and accountable decision-making, but this is often not the case [ 125 ]. These units are recognized by the national authorities as regulators (eg, the National Medicines Regulatory Authority [NMRA]) [ 126 ]. Such organizational structures hinder the effectiveness of the national regulatory authorities in fulfilling their mandate and prevent the establishment of quality management systems to ensure transparent and accountable decision-making [ 125 ].

Furthermore, Essén et al [ 23 ] analyzed health app policy or regulation in 9 high-income countries (Sweden, Norway, Denmark, Netherlands, Belgium, Germany, England, the United States, and Singapore) and found that most of these countries adopted centralized approaches to app evaluation. Although centralized approaches might have advantages over self-evaluation, they may create bottlenecks and limit the availability of high-quality health apps for users. As suggested by Essén et al [ 23 ], a decentralized approach, such as the accreditation of evaluation agencies, may be a worthwhile solution. However, this will require adequate coordination to ensure the consistency and reliability of the evaluation criteria and methods across different agencies as well as the transparency and accountability of the accreditation process. A possible way to achieve this is to adopt a common framework that can guide the evaluation and accreditation of health apps.

Similarly, the postmarket surveillance (PMS) system, which is a new regulation for medical devices in Europe, is a process of collecting and analyzing data on medical devices after they have been launched into the market to ensure their safety and performance and to identify any problems or need for improvements [ 127 , 128 ]. The PMS system is important because premarket data, which are obtained from testing a medical device before it is launched, have limitations in capturing the long-term performance and risks of the device [ 128 ]. Currently, the PMS system does not cover fitness and wellness apps, which are commonly used in self-management. Hence, Yu [ 93 ] proposed that the PMS system should also be applied to DHTs, such as fitness and wellness apps, arguing that postmarket data would help regulators periodically review and adjust the regulatory standards for these groups of health apps based on their risks and benefits.

The experience of the United Kingdom demonstrates that the regulation of health apps is a complex, multifaceted, and evolving process that involves different regulators and criteria depending on the nature and function of the app. For instance, a centralized NHS Apps Library was launched as a beta site in April 2017 to provide patients with a collection of trusted and easy-to-use digital health tools [ 129 ]. The library provided access to a range of health apps that were reviewed and approved by the NHS, including apps that could help patients manage conditions such as diabetes, mental health, and chronic obstructive pulmonary disease [ 130 ]. However, the library was closed in December 2021 [ 131 ]. Although no reason for the closure was provided on the website, it was likely due to persistent concerns regarding the safety of patients and data privacy involving multiple apps, including those listed in the library [ 12 , 14 - 16 , 131 , 132 ]. The NHS App was introduced in January 2019, before the closure of the NHS Apps Library, to serve as the gateway for accessing NHS services, including ordering repeat prescriptions and booking or managing appointments [ 133 ].

Furthermore, the United Kingdom Health Security Agency, formerly known as Public Health England, issued guidance on criteria for health app assessment in October 2017 [ 99 ]. The purpose of this guidance was to ensure that all health apps built for the UK population work well and provide clear information about their functions, benefits, and intended outcomes for patients and health care professionals. On the basis of this guidance, those intending to build an app are required to conform to certain regulations before being considered for the app assessment process. The 2 main regulations are the medical device regulation and the Care Quality Commission (CQC) registration. Apps that are considered medical devices must be registered with the MHRA and have a CE mark. Apps providing health or social care that fit into 1 of 14 regulated activities are required to register with the CQC before they can be assessed [ 134 ]. The CQC is an independent regulator of health and social care services in England.

Similarly, the Organisation for the Review of Care and Health Apps (ORCHA) is a UK-based organization that independently evaluates and distributes health apps. It provides services such as app review, accreditation, curation, and recommendation within the United Kingdom and across the world [ 135 ]. ORCHA also enables organizations (including the NHS) to build a decentralized web-based digital health library of consumer-friendly over-the-counter apps [ 135 - 137 ]. These apps are continuously assessed by ORCHA against the standards and regulations in clinical and professional assurance, data quality and privacy, and usability and accessibility [ 137 ].

In addition, the Digital Technology Assessment Criteria (DTAC) was introduced in beta in October 2020, and its first official version was subsequently launched in February 2021 [ 138 ]. The DTAC plays a crucial role in ensuring that digital health tools meet the necessary standards in areas such as clinical safety, data protection, technical security, interoperability, usability, and accessibility. By serving as the national baseline criteria for DHTs in the NHS and social care, it provides a valuable framework for health care organizations during procurement. It also offers guidance for developers on the expectations for their digital technologies within the NHS and social care. This is an example of how a harmonized framework can help ensure the quality and safety of DHTs, including health apps.

In addition, the National Institute for Health and Care Excellence Evidence Standards Framework is a set of evidence standards for a wide range of DHTs designed to help evaluators and decision makers in the health care system to consistently identify DHTs that are likely to offer benefits to the users and the health care system [ 139 ]. The Evidence Standards Framework was first published in March 2019 and is ideally used before DHTs (including health apps) are considered for commissioning or procurement by the NHS [ 140 ]. It is a crucial tool for ensuring that DHTs are clinically effective and offer value to the health and care system in the United Kingdom. In August 2022, the framework was updated to include AI and data-driven technologies with adaptive algorithms [ 140 ].

Furthermore, DTx apps, which are a type of medical device, are not allowed into the UK market unless they comply with the UK GDPR and meet the requirements of DTAC. In addition, they must bear the CE or UK Conformity Assessed marks [ 141 ]. This means that DTx apps must demonstrate their safety and efficacy through clinical trials and comply with the relevant regulations for data protection and quality standards as regulated by the MHRA. DTx products are also recognized as DHTs under the National Institute for Health and Care Excellence Evidence Standards Framework [ 142 ]. DTx incorporates software to treat, prevent, or manage specific diseases or conditions [ 143 , 144 ]. The fact that DTx products typically focus on a narrow clinical indication and generate evidence of clinical efficacy underscores their potential to make a substantial contribution to self-management and health care delivery in general. The increasing recognition of the role of DTx in patient care by regulators is also noteworthy, and the creation of regulatory and reimbursement pathways for approved apps further enables DTx products to continue to play an important role in impacting health care delivery [ 1 , 143 ]. This is a testament to the potential of regulated health apps to revolutionize health care and improve patient outcomes.

Among the many lessons to learn from the experience of the United Kingdom is that the regulation of health apps must evolve to keep pace with advances in DHTs and adapt to the changing needs and demands of digital health. Moreover, efforts are being made to streamline the multifaceted approaches to simplify app regulation and access in the United Kingdom [ 23 ]. Therefore, a robust and dynamic coordination mechanism, along with political will, skilled personnel, reliable funding, and a robust framework for monitoring and evaluating progress and aligning key performance indicators, is essential for countries in sub-Saharan Africa to keep pace with the advancement in the regulation of health apps. There is also a need to strengthen collaboration and ensure regulatory harmonization among national regulatory authorities and continental bodies such as the regional economic communities, AMRH, and the WHO AFRO [ 126 ].

Capacity building and monitoring and evaluation are important factors for ensuring effective regulation of health apps given the complex nature of the process. The regulation of medical products (including health apps) in sub-Saharan Africa generally includes licensing and accreditation, evaluation, inspection, quality control, information dissemination and promotion, and monitoring of adverse events [ 125 ]. Therefore, high-level skills as well as effective monitoring and evaluation will be required to ensure the success of the process. For most countries in sub-Saharan Africa, the NMRA is responsible for coordinating and overseeing the regulatory system of medical products [ 125 , 126 ]. However, in most cases, NMRAs are unable to perform the core regulatory functions expected of them [ 145 ]. More than 90% of African countries have limited or no capacity to regulate medical products, with only 7% having moderately developed capabilities [ 145 ]. The lack of effective NMRAs in Africa exposes the citizens to potential harm by allowing unsafe, low-quality, and fake medical products to circulate and be used [ 145 ].

Although it is the responsibility of governments to establish functional regulatory systems and ensure effective monitoring and evaluation of the regulatory process, the involvement of international and continental organizations in helping sub-Saharan African countries improve the regulatory capacity of their national regulatory agencies would be extremely beneficial. For instance, the African Medicines Agency (AMA) was established in November 2019 as a treaty adopted by the African Union Member States to help address the concerns arising from weak regulatory systems on the continent. At present, 37 countries have signed the AMA treaty, including 26 countries that have ratified it [ 146 ]. The main objective of the AMA is to enhance the capacity of States Parties and regional economic communities to regulate medical products to improve the quality, safety, and efficacy of medical products on the continent [ 147 ]. The AMA, in collaboration with other existing capacity building initiatives or organizations, such as the WHO Global Initiative on Digital Health, ITU, AMRH, WHO AFRO, and United Nations Children’s Fund, can assist sub-Saharan African countries in aligning their regulatory requirements with available resources and support them in acquiring the necessary tools and skills to build effective and sustainable regulatory systems for health apps. This can be achieved by adopting a decentralized approach to engage a network of technical experts across the African Union, similar to the model of the European Medicines Agency [ 148 ].

Actors or Stakeholders

The regulation of health apps often requires working with a wide range of actors or stakeholders. However, in this review, we identified only 2 main actor groups (those who provide digital health services and those who use the ICT infrastructure of the health ministry). These are the groups that are targeted by the included regulatory standards.

From a broader perspective, 12 categories of stakeholders were mapped in this study according to their potential role in regulating health apps for self-management. The potential contribution of these stakeholders to the regulation of health apps for self-management in sub-Saharan Africa depends not only on their roles and responsibilities but also on their interests, needs, expectations, and influence [ 41 , 149 - 151 ]. Thus, a robust stakeholder analysis is paramount as it can help define the scope of the regulatory process, prioritize the requirements, manage the expectations, and ensure the engagement and participation of stakeholders throughout the regulatory process [ 41 , 152 - 156 ]. Our stakeholder mapping, as presented in Table 2 (refer to Multimedia Appendix 4 for more details), lays the foundation for national governments to conduct a robust stakeholder analysis and to adopt an all-inclusive stakeholder engagement strategy to manage and sustain the engagement and participation of all relevant stakeholders [ 157 , 158 ].

Recommendations

Our review found that the regulation of health apps in sub-Saharan Africa is especially poor and almost nonexistent, as only Kenya has national standards that could address some of the regulatory issues related to health apps. Therefore, we recommend the following actions to help sub-Saharan African countries improve the regulation of health apps to support self-management:

  • Establish a clear and consistent definition of what constitutes a health app (considering AI or machine learning) and what level of regulation is required for different types of apps.
  • Develop and implement criteria and guidelines that ensure the quality, safety, and usability of health apps.
  • Engage with independent app evaluators, such as ORCHA, to adopt a common framework that can guide the evaluation and accreditation of health apps and use the framework to create and maintain decentralized and transparent platforms that showcase and evaluate health apps for users and health care professionals.
  • Develop and implement policies and regulations that enable sustainable funding for health apps such as integrating the use of health apps for self-management into existing health systems and financing pathways or mechanisms.
  • Support and facilitate innovation and collaboration across the sub-Saharan Africa region, especially in areas including but not limited to data security and privacy, interoperability standards, usability, accessibility, funding, capacity building, and monitoring and evaluation of the regulatory process.
  • Manage and sustain the engagement, involvement, and participation of all relevant stakeholders in the regulatory process by conducting a robust stakeholder analysis and adopting an all-inclusive stakeholder engagement strategy.

Strengths and Limitations of the Study

This study has several strengths, which include an extensive search of gray literature and repositories, contact with key individuals, and the use of a systematic approach. First, given that regulatory standards and guidance are unavailable in scientific databases, a wide range of gray literature and repositories were searched. In addition, key staff members at the MOHs, the WHO country offices, and the WHO AFRO were contacted to obtain relevant documents. Second, a policy analysis framework was adapted and used to systematically organize the key study findings, whereas a deductive descriptive qualitative content analysis approach was used to identify and analyze texts that contained relevant concepts and other related information based on the 4 predefined themes. Third, the RISA tool was used to guide the mapping of key stakeholders, which further increased the robustness of the study findings.

This study also has some limitations. First, our literature search was conducted in English, although it still yielded documents written in French and Portuguese from the ICTworks repository. Second, regulatory standards and guidance are not readily available on scientific databases; hence, it is possible that some relevant documents might have been missed. However, efforts were made to obtain these documents by contacting key stakeholders, including key contact persons at the WHO AFRO, WHO country offices, and MOHs. Third, key individuals were contacted only to request documents rather than for direct interviews. Interviewing key contact persons and stakeholders to obtain additional information could have strengthened the review; however, this was beyond the scope of this review. Nonetheless, we recommend that future studies consider incorporating interviews to explore the perspectives of key stakeholders.

Conclusions

Health apps are increasingly being used by patients to manage their health, and sub-Saharan African countries can leverage these apps to advance their progress toward achieving SDG 3 (good health and well-being) and UHC, especially given the rapid advancement of AI and big data. However, our study has established that the regulation of health apps in sub-Saharan Africa is inadequate to ensure that health apps are technically reliable and clinically safe; interoperable across systems; compliant with the principles of confidentiality of information and data privacy; culturally appropriate and relevant; and accessible to everyone regardless of gender, ethnicity, location, or income. Therefore, the region can learn from the experiences of some high-income countries such as the United Kingdom and Germany to develop and implement a robust and responsive regulatory system that supports the widespread adoption of safe, effective, and beneficial health apps for its population.

Following the publication of this review, a summary of the findings will be disseminated to the relevant organizations. In addition, the key findings will be summarized and presented at national, regional, and international conferences.

Acknowledgments

The authors would like to thank Rebecca Jones, the Library Manager and Liaison Librarian at Charing Cross Library, who advised and assisted with the search strategy for this study. This work is part of the PhD research of BAB, which is sponsored by the government of Nigeria. AM and JC were supported by the National Institute for Health and Care Research (NIHR) Applied Research Collaboration Northwest London (NIHR200180). The views expressed in this publication are those of the authors and not necessarily those of the government of Nigeria or the NIHR or the Department of Health and Social Care. In the Results and Discussion sections, Microsoft Copilot in Bing [ 159 ] was used to help summarize and modify a few texts as well as suggest some citations.

Data Availability

The search strategy for PubMed, Scopus, and the World Health Organization AIM is presented in Multimedia Appendix 1. All data generated or analyzed during this study are included in this published article (and its supplementary information files). The documents analyzed are available directly from the relevant institutional websites, the ICTworks repository [ 44 ], or upon request from the relevant government departments in each country. Additionally, documents in the list of references that are not accessible on the web can be solicited from the corresponding author on reasonable request.

Authors' Contributions

BAB and JC conceived the study. BAB designed the study with contributions from JC and NM. BAB drafted the manuscript, and JC, NM, AM, SI, KPF, BIH, and NU read and contributed to it. AM was the clinical lead, and JC acted as a guarantor for this study. The final manuscript was read and approved by all the authors.

Conflicts of Interest

None declared.

PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews) checklist.

Database search strategy.

Details of included documents.

Mapping of the stakeholders according to their potential role in regulating health apps for self-management.

  • Aitken M, Nass D. Digital health trends 2021: innovation, evidence, regulation, and adoption. IQVIA Institute for Human Data Science. 2021. URL: https://www.iqvia.com/-/media/iqvia/pdfs/institute-reports/digital-health-trends-2021/iqvia-institute-digital-health-trends-2021.pdf?&_=1669449368070 [accessed 2022-11-26]
  • Mobile app threat landscape report. RiskIQ. 2020. URL: https://www.riskiq.com/2020-mobile-threat-landscape-report-thank-you/ [accessed 2021-07-19]
  • El-Sappagh S, Ali F, Hendawi A, Jang JH, Kwak KS. A mobile health monitoring-and-treatment system based on integration of the SSN sensor ontology and the HL7 FHIR standard. BMC Med Inform Decis Mak. May 10, 2019;19(1):97. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Labrique AB, Vasudevan L, Kochi E, Fabricant R, Mehl G. mHealth innovations as health system strengthening tools: 12 common applications and a visual framework. Glob Health Sci Pract. Aug 06, 2013;1(2):160-171. [ FREE Full text ] [ CrossRef ]
  • Adepoju IOO, Albersen BJA, De Brouwere V, van Roosmalen J, Zweekhorst M. mHealth for clinical decision-making in sub-Saharan Africa: a scoping review. JMIR Mhealth Uhealth. Mar 23, 2017;5(3):e38. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Vegesna A, Tran M, Angelaccio M, Arcona S. Remote patient monitoring via non-invasive digital technologies: a systematic review. Telemed J E Health. Jan 2017;23(1):3-17. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Use of appropriate digital technologies for public health: Report by the Director-General. World Health Organization. 2016. URL: https://iris.who.int/handle/10665/274134 [accessed 2023-05-06]
  • El-Osta A, Rowe C, Majeed A. Developing a shared definition of self-driven healthcare to enhance the current healthcare delivery paradigm. J R Soc Med. Nov 2022;115(11):424-428. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Hussein R. A review of realizing the universal health coverage (UHC) goals by 2030: part 2- what is the role of eHealth and technology? J Med Syst. Jul 2015;39(7):72. [ CrossRef ] [ Medline ]
  • Sustainable development goal 3: Ensure healthy lives and promote well-being for all at all ages. United Nations. URL: https://sdgs.un.org/goals/goal3 [accessed 2023-05-07]
  • Coronavirus: apps to help the elderly. Organisation for the Review of Care and Health Apps. 2020. URL: https://orchahealth.com/coronavirus-apps-to-help-the-elderly/ [accessed 2021-07-19]
  • Huckvale K, Adomaviciute S, Prieto JT, Leow MKS, Car J. Smartphone apps for calculating insulin dose: a systematic assessment. BMC Med. May 06, 2015;13:106. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Huckvale K, Car M, Morrison C, Car J. Apps for asthma self-management: a systematic assessment of content and tools. BMC Med. Nov 22, 2012;10:144. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Huckvale K, Morrison C, Ouyang J, Ghaghda A, Car J. The evolution of mobile apps for asthma: an updated systematic assessment of content and tools. BMC Med. Mar 23, 2015;13:58. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Huckvale K, Prieto JT, Tilney M, Benghozi PJ, Car J. Unaddressed privacy risks in accredited health and wellness apps: a cross-sectional systematic assessment. BMC Med. Sep 07, 2015;13:214. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Huckvale K, Torous J, Larsen ME. Assessment of the data sharing and privacy practices of smartphone apps for depression and smoking cessation. JAMA Netw Open. Apr 05, 2019;2(4):e192542. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Ndlovu K, Mars M, Scott RE. Interoperability frameworks linking mHealth applications to electronic record systems. BMC Health Serv Res. May 13, 2021;21(1):459. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Kenya standards and guidelines for mHealth systems. Kenya Ministry of Health. 2017. URL: https://www.health.go.ke/wp-content/uploads/2020/02/Revised-Guidelines-For-Mhealth-Systems-May-Version.pdf [accessed 2023-03-21]
  • Standard for electronic health record system (EHRs) in Ethiopia. Ethiopia Minister of Health. 2021. URL: https://registry.betterehealth.eu/ehealth-policies/standard-electronic-health-record-system-ehrs-ethiopia [accessed 2023-04-21]
  • National health normative standards framework for digital health interoperability in South Africa. South Africa Department of Health. 2021. URL: https://www.health.gov.za/wp-content/uploads/2022/10/HNSF_Gazette_21_October_2022.pdf [accessed 2023-05-15]
  • Ferretti A, Ronchi E, Vayena E. From principles to practice: benchmarking government guidance on health apps. Lancet Digit Health. Jun 2019;1(2):e55-e57. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Diao JA, Venkatesh KP, Raza MM, Kvedar JC. Multinational landscape of health app policy: toward regulatory consensus on digital health. NPJ Digit Med. May 11, 2022;5(1):61. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Essén A, Stern AD, Haase CB, Car J, Greaves F, Paparova D, et al. Health app policy: international comparison of nine countries' approaches. NPJ Digit Med. Mar 18, 2022;5(1):31. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Brown SA, Garcia AA, Kouzekanani K, Hanis CL. Culturally competent diabetes self-management education for Mexican Americans: the Starr County border health initiative. Diabetes Care. Feb 2002;25(2):259-268. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Chaney SC, Mechael P. Self-Care Trailblazer Group. 2020. URL: https://media.psi.org/wp-content/uploads/2020/09/31000510/Digital-Self-Care-Final.pdf [accessed 2021-05-20]
  • Kanzaveli T. Healthcare: shifting from “one size fits all” to “one size fits one”. Medium. 2017. URL: https://tkanzaveli.medium.com/healthcare-shifting-from-one-size-fits-all-to-one-size-fits-one-d56136ded705 [accessed 2022-03-04]
  • Myth 1 – one app will fit all!. Organisation for the Review of Care and Health Apps. URL: https://orchahealth.com/myth-1-one-app-will-fit-all/ [accessed 2022-03-04]
  • Aitken M, Lyle J. Patient adoption of mHealth: use, evidence and remaining barriers to mainstream acceptance. IQVIA Institute for Human Data Science. Sep 2015. URL: https://www.iqvia.com/-/media/iqvia/pdfs/institute-reports/patient-adoption-of-mhealth.pdf [accessed 2021-05-21]
  • Mechael P, Batavia H, Kaonga N. Barriers and gaps affecting mhealth in low and middle income countries: policy white paper. Center for Global Health and Economic Development Earth Institute, Columbia University. 2010. URL: http://www.globalproblems-globalsolutions-files.org/pdfs/mHealth_Barriers_White_Paper.pdf [accessed 2021-03-24]
  • Bene BA, Ibeneme S, Fadahunsi KP, Harri BI, Ukor N, Mastellos N, et al. Regulatory standards and guidance for the use of health applications for self-management in Africa: scoping review protocol. BMJ Open. Feb 11, 2022;12(2):e058067. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Aitken M, Gauntlett C. Patient apps for improved healthcare: from novelty to mainstream. IMS Institute for Healthcare Informatics. 2013. URL: https://ignacioriesgo.es/wp-content/uploads/2014/03/iihi_patient_apps_report_editora_39_2_1.pdf [accessed 2024-03-10]
  • National eHealth Strategy Toolkit. World Health Organization, International Telecommunication Union. 2012. URL: https://www.itu.int/pub/D-STR-E_HEALTH.05-2012 [accessed 2021-06-28]
  • Global strategy on digital health 2020-2025. World Health Organization. 2021. URL: https://www.who.int/docs/default-source/documents/gs4dhdaa2a9f352b0445bafbc79ca799dce4d.pdf [accessed 2021-06-23]
  • Arksey H, O'Malley L. Scoping studies: towards a methodological framework. Int J Soc Res Methodol. Feb 2005;8(1):19-32. [ FREE Full text ] [ CrossRef ]
  • Anderson S, Allen P, Peckham S, Goodwin N. Asking the right questions: scoping studies in the commissioning of research on the organisation and delivery of health services. Health Res Policy Syst. Jul 09, 2008;6:7. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Levac D, Colquhoun H, O'Brien KK. Scoping studies: advancing the methodology. Implement Sci. Sep 20, 2010;5:69. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Peters MDJ, Marnie C, Tricco AC, Pollock D, Munn Z, Alexander L, et al. Updated methodological guidance for the conduct of scoping reviews. JBI Evid Synth. Oct 2020;18(10):2119-2126. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Tricco AC, Lillie E, Zarin W, O'Brien KK, Colquhoun H, Levac D, et al. PRISMA Extension for Scoping Reviews (PRISMA-ScR): checklist and explanation. Ann Intern Med. Oct 02, 2018;169(7):467-473. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Leitner C, Potenziani D. Mendeley reference manager. Mendeley. 2022. URL: https://www.mendeley.com/reference-management/reference-manager [accessed 2022-08-03]
  • Better systematic review management. Covidence. URL: https://www.covidence.org/ [accessed 2023-02-13]
  • Franco-Trigo L, Fernandez-Llimos F, Martínez-Martínez F, Benrimoj SI, Sabater-Hernández D. Stakeholder analysis in health innovation planning processes: A systematic scoping review. Health Policy. Oct 2020;124(10):1083-1099. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Walt G, Gilson L. Reforming the health sector in developing countries: the central role of policy analysis. Health Policy Plan. Dec 1994;9(4):353-370. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Digital health: a call for government leadership and cooperation between ICT and health. Broadband Commission. 2017. URL: https://broadbandcommission.org/wp-content/uploads/2021/09/WGHealth_Report2017-.pdf [accessed 2021-06-28]
  • Vota W. Every African country’s national eHealth strategy or digital health policy. ICT works. 2019. URL: https://www.ictworks.org/african-national-ehealth-strategy-policy/ [accessed 2023-12-10]
  • Moher D, Liberati A, Tetzlaff J, Altman DG, PRISMA Group. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. PLoS Med. Jul 21, 2009;6(7):e1000097. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Standards and guidelines for electronic medical record systems in Kenya. Kenya Ministry of Medical Services, Kenya Ministry of Public Health and Sanitation. 2010. URL: http://guidelines.health.go.ke:8000/media/Standards_and_Guidelines_for_EMR_Systems.pdf [accessed 2023-04-21]
  • Kenya standards and guidelines for E-health systems interoperability. Kenya Ministry of Health, AfyaInfo Project. 2014. URL: https://pdf.usaid.gov/pdf_docs/PA00TB2K.pdf [accessed 2023-03-21]
  • Health sector ICT standards and guidelines. Kenya Ministry of Health. 2013. URL: https://www.medbox.org/pdf/5e148832db60a2044c2d2895 [accessed 2023-03-21]
  • Health information exchange standard operating procedure (SOP) and guideline. Nigeria Federal Ministry of Health. Jul 2020.
  • National eHealth strategy 2018-2022. Benin Ministry of Health. 2017.
  • The eHealth strategy of Botswana 2020-2024. Botswana Ministry of Health. URL: https://ehealth.ub.bw/bhdc/Docs/MOH%20ehealth%20Strategy%20Book%20A4.pdf [accessed 2023-04-22]
  • Health sector digital strategy 2016-2020. Burkina Faso Ministry of Health.
  • National health informatics development plan of Burundi. Burundi Ministry of Public Health. 2015.
  • The 2020-2024 national digital health strategic plan. Cameroon Ministry of Public Health. 2020.
  • National eHealth strategy 2017-2021. Comoros Ministry of Health. 2016.
  • eHealth strategic plan. Cote d’Ivoire Minister of Health and Public Hygiene. 2011.
  • National development plan for health informatics. Democratic Republic of Congo Ministry of Public Health. 2014.
  • Kingdom of Swaziland eHealth strategy 2016 - 2020. Kingdom of Swaziland Ministry Of Health. 2016.
  • Information revolution strategic plan (2018-2025). Ethiopia Ministry of Health. 2018.
  • Strategic master plan of the health information system of the Gabon. Gabon Ministry of Public Health and Population. 2017.
  • National e-Health strategy. Ghana Ministry of Health. 2010.
  • Kenya national e-Health strategy. Kenya Ministry of Medical Services, Kenya Ministry of Public Health & Sanitation. 2011.
  • Kenya national eHealth policy 2016-2030. Kenya Ministry of Health. 2016.
  • National strategy - Liberia - 2016-2021. Liberia Ministry of Health. 2016.
  • Strategic plan for strengthening the health information system of Madagascar 2018–2022. Madagascar Ministry of Public Health. 2017.
  • National digital health strategy 2020-2025. Malawi Ministry of Health. 2020.
  • National eHealth policy in Mali. Mali Ministry of Health and Public Hygiene. 2013.
  • Health 2015: seamless continuity of care. Mauritius Ministry of Health and Quality of Life. 2015.
  • Strategic plan of information system for health 2009-2014. Mozambique Ministry of Health. 2009.
  • National eHealth strategy 2019-2023. Niger Ministry of Public Health. 2018.
  • National digital health strategy 2021-2025. Nigeria Federal Ministry of Health. 2021.
  • National digital health policy. Nigeria Federal Ministry of Health. 2021.
  • National digital health strategic plan 2018-2023. Rwanda Ministry of Health. 2018. URL: https://elearning.helinanet.org/mod/resource/view.php?id=890 [accessed 2023-05-09]
  • Strategic plan for digital health 2018-2023. Senegal Ministry of Health and Social Action. 2018.
  • National digital health strategy 2018-2023. Sierra Leone Ministry of Health and Sanitation, Sierra Leone Ministry of Information and Communication. 2018.
  • The national digital health strategy 2019 – 2024. Tanzania Ministry of Health, Community Development, Gender, Elderly and Children. 2019.
  • National digital health strategy for South Africa 2019 - 2024. South Africa Department of Health. 2019.
  • Strategic plan for the development of eHealth in Togo 2013-2015. Togo Ministry of Health. 2012.
  • Uganda national eHealth policy. Uganda Ministry of Health. 2016.
  • Uganda national eHealth strategy 2017 - 2021. Uganda Ministry of Health. URL: https://health.go.ug/sites/default/files/National%20e_Health%20Strategy_0.pdf [accessed 2023-05-16]
  • National eHealth strategy 2017-2021. Zambia Ministry of Health. 2017.
  • Zimbabwe’s E-Health strategy 2012-2017. Ministry of Health and Child Welfare. 2012.
  • National eHealth strategy 2021-2025. Namibia Ministry of Health & Social Services. 2021. URL: https://www.scribd.com/document/639371316/eHealth-Strategy-Namibia-2021# [accessed 2023-05-13]
  • Health sector ICT policy and strategy. Ghana Ministry of Health. 2005. URL: https://www.moh.gov.gh/wp-content/uploads/2016/02/Health-Sector-ICT-Policy-and-Strategy.pdf [accessed 2023-05-08]
  • Adebola OJ. Beyond national digital health strategy: final report of end term evaluation for the National Health ICT Strategic Framework 2015-2020. Nigeria Federal Ministry of Health. May 2021.
  • National Health ICT Strategic Framework 2015 - 2020. Nigeria Federal Ministry of Health. 2016. URL: https://www.health.gov.ng/doc/HealthICTStrategicFramework.pdf [accessed 2023-05-16]
  • Digital health blueprint. Ethiopia Ministry of Health. 2021. URL: http://repository.iifphc.org/bitstream/handle/123456789/1658/Ethiopian-Digital-Health-Blueprint.pdf?sequence=1&isAllowed=y [accessed 2023-05-16]
  • Kenya health information systems interoperability framework. Kenya Ministry of Health. 2020. URL: https://www.data4sdgs.org/sites/default/files/services_files/Kenya%20Health%20Information%20Systems%20Interoperability%20Framework.pdf [accessed 2023-05-16]
  • National community health digitization strategy 2020-2025. Kenya Ministry of Health, Division of Community Health Services. 2021. URL: https://www.eahealth.org/sites/www.eahealth.org/files/content/attachments/2021-08-02/eCHIS-Strategy-2020-2025.pdf [accessed 2023-05-16]
  • Leitner C, Potenziani D. Health information systems interoperability in Liberia. IntraHealth International. 2016. URL: https://elearning.helinanet.org/mod/resource/view.php?id=938 [accessed 2023-05-16]
  • Narrative for 2022 national digital health annual operational plan (AOP). Nigeria Federal Ministry of Health. 2022.
  • Tanzania digital health investment road map 2017-2023. Tanzania Ministry of Health, Community Development, Gender, Elderly and Children, President’s Office Regional Administration and Local Government. 2017.
  • Yu H. Regulation of digital health technologies in the European Union: intended versus actual use. In: Cohen GI, Minssen T, Price II NW, Robertson C, Shachar C, editors. The Future of Medical Device Regulation: Innovation and Protection. Cambridge. Cambridge University Press; Mar 31, 2022;103-114.
  • Policy for device software functions and mobile medical applications: guidance for industry and Food and Drug Administration staff. U.S. Food and Drug Administration. 2022. URL: https://www.fda.gov/media/80958/download [accessed 2023-10-10]
  • Gordon WJ, Landman A, Zhang H, Bates DW. Beyond validation: getting health apps into clinical practice. NPJ Digit Med. 2020;3:14. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • FDA clears mobile medical app to help those with opioid use disorder stay in recovery programs. U.S. Food and Drug Administration. 2018. URL: https://www.fda.gov/news-events/press-announcements/fda-clears-mobile-medical-app-help-those-opioid-use-disorder-stay-recovery-programs [accessed 2021-01-27]
  • Digital maturity model: achieving digital maturity to drive growth. Deloitte. 2018. URL: https://www2.deloitte.com/content/dam/Deloitte/global/Documents/Technology-Media-Telecommunications/deloitte-digital-maturity-model.pdf [accessed 2021-10-20]
  • May E. How digital health apps are empowering patients. Deloitte. 2021. URL: https://www2.deloitte.com/us/en/blog/health-care-blog/2021/how-digital-health-apps-are-empowering-patients.html [accessed 2023-10-06]
  • Guidance: criteria for health app assessment. Public Health England. 2017. URL: https://www.gov.uk/government/publications/health-app-assessment-criteria/criteria-for-health-app-assessment [accessed 2023-10-16]
  • Clinical risk management standards. National Health Service Digital. 2020. URL: https://digital.nhs.uk/services/clinical-safety/clinical-risk-management-standards [accessed 2023-10-28]
  • Report a problem with a medicine or medical device. Gov.uk. URL: https://www.gov.uk/report-problem-medicine-medical-device [accessed 2023-11-07]
  • Digital technology assessment criteria for health and social care (DTAC) - Version 1.0. National Health Service X. 2021. URL: https://view.officeapps.live.com/op/view.aspx?src=https%3A%2F%2Ftransform.england.nhs.uk%2Fmedia%2Fdocuments%2FDTAC_version_1.0_FINAL_updated_16.04.odt&wdOrigin=BROWSELINK [accessed 2023-11-07]
  • Data protection impact assessment: NHS login - formerly Citizen Identity. National Health Service Digital. 2022. URL: https://digital.nhs.uk/services/nhs-login/data-protection-impact-assessment [accessed 2023-11-07]
  • Risks and data protection impact assessments (DPIAs). Information Commissioner’s Office. URL: https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/accountability-and-governance/accountability-framework/risks-and-data-protection-impact-assessments-dpias/ [accessed 2023-11-07]
  • A guide to good practice for digital and data-driven health technologies. Department of Health and Social Care. 2021. URL: https://www.gov.uk/government/publications/code-of-conduct-for-data-driven-health-and-care-technology/initial-code-of-conduct-for-data-driven-health-and-care-technology [accessed 2023-10-30]
  • Interoperability in healthcare. Healthcare Information and Management Systems Society (HIMSS). 2023. URL: https://www.himss.org/resources/interoperability-healthcare [accessed 2023-10-17]
  • DAPB4020: UK core Fast Healthcare Interoperability Resources (FHIR) release 4 (R4) governance. National Health Service Digital. 2022. URL: https://digital.nhs.uk/data-and-information/information-standards/information-standards-and-data-collections-including-extractions/publications-and-notifications/standards-and-collections/dapb4020-uk-core-fhir-r4-governance [accessed 2023-10-17]
  • Fast Healthcare Interoperability Resources (FHIR). National Health Service Digital. 2022. URL: https://digital.nhs.uk/services/fhir-apis [accessed 2023-10-17]
  • FHIR Interoperability Basics: 4 things to know. Health IT Analytics. 2022. URL: https://healthitanalytics.com/news/4-basics-to-know-about-the-role-of-fhir-in-interoperability [accessed 2023-11-07]
  • Giordanengo A, Bradway M, Pedersen R, Grøttland A, Hartvigsen G, Årsand E. Integrating data from apps, wearables and personal electronic health record (pEHR) systems with clinicians’ electronic health records (EHR) systems. Int J Integr Care. Nov 09, 2016;16(5):16. [ FREE Full text ] [ CrossRef ]
  • A plan for digital health and social care. Department of Health & Social Care. 2022. URL: https://www.gov.uk/government/publications/a-plan-for-digital-health-and-social-care/a-plan-for-digital-health-and-social-care [accessed 2022-12-01]
  • Ryu B, Kim N, Heo E, Yoo S, Lee K, Hwang H, et al. Impact of an electronic health record-integrated personal health record on patient participation in health care: development and randomized controlled trial of MyHealthKeeper. J Med Internet Res. Dec 07, 2017;19(12):e401. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Winter A, Takabayashi K, Jahn F, Kimura E, Engelbrecht R, Haux R, et al. Quality requirements for electronic health record systems*. A Japanese-German information management perspective. Methods Inf Med. Aug 07, 2017;56(7):e92-e104. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Wachter RM. Making IT work: harnessing the power of health information technology to improve care in England. National Advisory Group on Health Information Technology. 2016. URL: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/550866/Wachter_Review_Accessible.pdf [accessed 2021-07-22]
  • Framework on integrated people-centred health services (IPCHS). World Health Organization. 2023. URL: https://www.who.int/teams/integrated-health-services/clinical-services-and-systems/service-organizations-and-integration [accessed 2023-06-05]
  • Ibeneme S, Karamagi H, Muneene D, Goswami K, Chisaka N, Okeibunor J. Strengthening health systems using innovative digital health technologies in Africa. Front Digit Health. 2022;4:854339. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Ibeneme S, Ukor N, Ongom M, Dasa T, Muneene D, Okeibunor J. Strengthening capacities among digital health leaders for the development and implementation of national digital health programs in Nigeria. BMC Proc. 2020;14(Suppl 10):9. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Delivering safe digital health. Organisation for the Review of Care and Health Apps. URL: https://orchahealth.com/ [accessed 2023-10-22]
  • Mamuye AL, Yilma TM, Abdulwahab A, Broomhead S, Zondo P, Kyeng M, et al. Health information exchange policy and standards for digital health systems in Africa: a systematic review. PLOS Digit Health. Oct 2022;1(10):e0000118. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Digital health platform handbook: Building a digital information infrastructure (infostructure) for health. World Health Organization, International Telecommunication Union. 2022. URL: https://www.itu.int/dms_pub/itu-d/opb/str/D-STR-E_HEALTH.10-2020-PDF-E.pdf [accessed 2021-05-22]
  • Framework for involving patients in patient safety 2021. National Health Service England. 2021. URL: https://www.england.nhs.uk/patient-safety/framework-for-involving-patients-in-patient-safety/ [accessed 2023-03-23]
  • Olesch A. Towards harmonised EU landscape for digital health: summary of the roundtable discussions in selected EIT Health InnoStars countries. EIT Health InnoStars. Jan 2023. URL: https://eithealth.eu/wp-content/uploads/2023/02/EIT_Health_DiGA_report_Jan2023.pdf [accessed 2023-10-10]
  • Grieb J, Tschammler D, Färber C, Woitz S. Digital health laws and regulations Germany. Global Legal Group. 2023. URL: https://iclg.com/practice-areas/digital-health-laws-and-regulations/germany [accessed 2023-11-03]
  • Human regulatory: medical devices. European Medicines Agency. URL: https://www.ema.europa.eu/en/human-regulatory/overview/medical-devices [accessed 2023-10-12]
  • Strengthening the capacity for regulation of medical products in the African region. World Health Organization Regional Office for Africa. 2013. URL: https://iris.who.int/bitstream/handle/10665/94308/AFR_RC63_7.pdf?sequence=1 [accessed 2023-10-17]
  • Ncube BM, Dube A, Ward K. Establishment of the African Medicines Agency: progress, challenges and regulatory readiness. J Pharm Policy Pract. Mar 08, 2021;14(1):29. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • Post market surveillance system. European Union Medical Device Regulation. 2023. URL: https://eumdr.com/post-market-surveillance-system/ [accessed 2023-10-31]
  • Dayal R. Effective post-market surveillance for medical devices: An essential part of medical devices regulation (MDR). Capgemini. 2020. URL: https://www.capgemini.com/insights/expert-perspectives/effective-post-market-surveillance-for-medical-devices-an-essential-part-of-mdr/ [accessed 2023-10-31]
  • NHS app library reaches 70 apps in honour of the NHS birthday. Northampton General Hospital NHS Trust. 2018. URL: https://www.northamptongeneral.nhs.uk/News/News-Archive/2018/NHS-App-Library-reaches-70-apps-in-honour-of-the-NHS-birthday.aspx [accessed 2023-09-21]
  • Developers invited to add to NHS apps library. National Health Service Digital. 2018. URL: https://digital.nhs.uk/news/2018/developers-invited-to-add-to-nhs-apps-library [accessed 2023-09-22]
  • The NHS apps library has closed. National Health Service Digital. 2021. URL: https://digital.nhs.uk/services/nhs-apps-library#:~:text=The%20NHS%20Apps%20Library%20was%20decommissioned%20in%20December%202021.&text=Further%20information%20can%20be%20found%20on%20the%20NHS.UK%20website [accessed 2023-09-22]
  • Larsen ME, Huckvale K, Nicholas J, Torous J, Birrell L, Li E, et al. Using science to sell apps: evaluation of mental health app store quality claims. NPJ Digit Med. 2019;2:18. [ FREE Full text ] [ CrossRef ] [ Medline ]
  • About the NHS app. National Health Service. Dec 4, 2023. URL: https://www.nhs.uk/nhs-app/about-the-nhs-app/ [accessed 2023-09-22]
  • Scope of registration: regulated activities. Care Quality Commission. 2022. URL: https://www.cqc.org.uk/guidance-providers/scope-registration-regulated-activities [accessed 2023-11-05]
  • Distributing great apps into health and care services across the world. Organisation for the Review of Care and Health Apps. 2020. URL: https://orchahealth.com/wp-content/uploads/2020/12/Health-and-Care-1.pdf [accessed 2023-10-09]
  • Our founder, our story and our values: we exist to make digital health healthy. Organisation for the Review of Care and Health Apps. URL: https://orchahealth.com/about-us/ [accessed 2023-10-09]
  • Health app library: empower your community with safe access to health apps and digital health products. Organisation for the Review of Care and Health Apps. URL: https://orchahealth.com/our-products/health-app-library/#:~:text=A%20Health%20App%20Library%20is,on%20the%20Health%20App%20Library [accessed 2023-10-09]
  • Digital technology assessment criteria (DTAC). National Health Service X. URL: https://www.nhsx.nhs.uk/key-tools-and-info/digital-technology-assessment-criteria-dtac/ [accessed 2023-10-09]
  • Evidence standards framework (ESF) for digital health technologies. National Institute for Health and Care Excellence. 2023. URL: https://www.nice.org.uk/about/what-we-do/our-programmes/evidence-standards-framework-for-digital-health-technologies [accessed 2023-10-08]
  • Tsang L, Kerr-Peterson H. UK NICE updates its evidence standards framework for data-driven digital health technologies. Ropes & Gray. 2022. URL: https://www.ropesgray.com/en/insights/alerts/2022/10/uk-nice-updates-its-evidence-standards-framework-for-data-driven-digital-health-technologies [accessed 2023-10-09]
  • Guidance: medical device stand-alone software including apps (including IVDMDs). Medicines and Healthcare products Regulatory Agency. 2023. URL: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/1168485/Medical_device_stand-alone_software_including_apps__including_IVDMDs_.pdf [accessed 2023-10-09]
  • Digital therapeutics in the United Kingdom. Digital Therapeutics Alliance. 2021. URL: https://dtxalliance.org/wp-content/uploads/2021/06/DTA_DTx-Overview_UK.pdf [accessed 2023-10-09]
  • Transforming global healthcare by advancing digital therapeutics. Digital Therapeutics Alliance. 2023. URL: https://dtxalliance.org/ [accessed 2023-10-10]
  • International Organization for Standardization (ISO) digital therapeutic definition. Digital Therapeutics Alliance. Jun 2023. URL: https://dtxalliance.org/wp-content/uploads/2023/06/DTA_FS_ISO-Definition.pdf [accessed 2023-10-09]
  • Ndomondo-Sigonda M, Miot J, Naidoo S, Dodoo A, Kaale E. Medicines regulation in Africa: current state and opportunities. Pharmaceut Med. 2017;31(6):383-397.
  • Chinele J. East Africa shows solid support for African Medicines Agency treaty. Health Policy Watch. Aug 16, 2023. URL: https://healthpolicy-watch.news/east-africa-shows-solid-support-for-african-medicines-agency-treaty/ [accessed 2023-10-09]
  • Treaty for the establishment of the African Medicines Agency 2019. African Union. 2019. URL: https://au.int/sites/default/files/treaties/36892-treaty-0069_-_ama_treaty_e.pdf [accessed 2023-10-17]
  • European Medicines Agency: about us. European Medicines Agency. Mar 1, 2023. URL: https://www.ema.europa.eu/en/documents/other/about-us-european-medicines-agency-ema_en.pdf [accessed 2023-10-18]
  • Bryson JM. What to do when stakeholders matter. Public Adm Rev. Mar 2004;6(1):21-53.
  • Iyawa G, Herselman M, Botha A. Potential stakeholders and perceived benefits of a digital health innovation ecosystem for the Namibian context. Procedia Computer Science. 2017;121:431-438.
  • Ferretti V. From stakeholders analysis to cognitive mapping and multi-attribute value theory: an integrated approach for policy support. European Journal of Operational Research. Sep 2016;253(2):524-541.
  • Brugha R, Varvasovszky Z. Stakeholder analysis: a review. Health Policy Plan. Sep 2000;15(3):239-246.
  • Schmeer K. Guidelines for conducting a stakeholder analysis 1999. Partnerships for Health Reform, Abt Associates. 1999. URL: https://www.ktecop.ca/wordpress/wp-content/uploads/guidelines-stakeholder-analysis-PHR-1999.pdf [accessed 2023-10-17]
  • Gilmour J, Beilin R. Stakeholder mapping for effective risk assessment and communication. Australian Centre of Excellence for Risk Analysis, University of Melbourne. Apr 2007. URL: https://cebra.unimelb.edu.au/__data/assets/pdf_file/0006/2220990/gilmour0609.pdf [accessed 2023-10-17]
  • Quality, service improvement and redesign tools: stakeholder analysis. National Health Service England, National Health Service Improvement. 2022. URL: https://www.england.nhs.uk/wp-content/uploads/2022/02/qsir-stakeholder-analysis.pdf [accessed 2023-10-20]
  • Craven MP, Lang AR, Martin JL. Developing mHealth apps with researchers: multi-stakeholder design considerations. Springer; 2014. Presented at: Third International Conference, DUXU 2014, held as a part of HCI International; June 22-27, 2014;15-24; Heraklion, Greece. URL: https://doi.org/10.1007/978-3-319-07635-5_2
  • How to encourage stakeholder participation. SustaiNet Software International. URL: https://sustainet.com/how-to-encourage-stakeholder-participation/ [accessed 2023-10-20]
  • Stakeholder engagement. Organisation for Economic Cooperation and Development. URL: https://www.oecd.org/governance/better-international-rulemaking/compendium/keyprinciples/stakeholderengagement.htm [accessed 2023-10-20]
  • Microsoft Copilot in Bing. Microsoft. URL: https://www.bing.com/chat?form=NTPCHB [accessed 2023-03-15]

Edited by A Mavragani; submitted 19.05.23; peer-reviewed by N O'Brien, A Essén; comments to author 07.09.23; revised version received 08.12.23; accepted 23.02.24; published 11.04.24.

©Benard Ayaka Bene, Sunny Ibeneme, Kayode Philip Fadahunsi, Bala Isa Harri, Nkiruka Ukor, Nikolaos Mastellos, Azeem Majeed, Josip Car. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 11.04.2024.

This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.

Lwati: A Journal of Contemporary Research / Vol. 21 No. 1 (2024) / Articles

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Copyright for articles published in this journal is retained by the journal.

The journal content is distributed under the terms of the Creative Commons License [CC BY-NC-ND 4.0] http://creativecommons.org/licenses/by-nc-nd/4.0

Globalization and the dynamics of national security in the 21st century

Sunday Adejoh

Globalization and the increasing interconnectivity of the international system have brought about changes in the nature of state relations and new trends in security threats within the international system. Security threats that did not previously exist have now emerged with the aid of globalization, particularly as a result of advances in information, communication and transportation technologies. Cybercrime, transnational crime, money laundering, human trafficking, terrorism financing, and the proliferation of small arms and light weapons, among several other security threats, now characterize the global system. This paper therefore attempts to interrogate globalization as a driver of insecurity. It is the position of this paper that national and international security dynamics have changed because of globalization. The study is desk research and relies primarily on secondary data. The paper recommends that states adopt security strategies that are in line with global trends so as to address security challenges. It also recommends the application of technology in addressing national security challenges.

A survey on security challenges in cloud computing: issues, threats, and solutions

  • Published: 28 February 2020
  • Volume 76, pages 9493–9532 (2020)

  • Hamed Tabrizchi &
  • Marjan Kuchaki Rafsanjani (ORCID: orcid.org/0000-0002-3220-4839)

Cloud computing has attracted considerable attention over the past decades because of continuously increasing demand. Organizations gain several advantages by moving to cloud-based data storage: simplified IT infrastructure and management, remote access from effectively anywhere in the world with a stable Internet connection, and the cost efficiencies that cloud computing can bring. The associated security and privacy challenges in the cloud require further exploration. Researchers from academia, industry, and standards organizations have proposed potential solutions to these challenges in previously published studies. The narrative review presented in this survey covers cloud security issues and requirements, identified threats, and known vulnerabilities. This work aims to analyze the different components of cloud computing and to present the security and privacy problems that these systems face. Moreover, it presents a new classification of recent security solutions in this area. The survey also introduces the various types of security threats facing cloud computing services, discusses open issues, and proposes future directions. The paper explores in detail the security challenges faced by cloud entities such as the cloud service provider, the data owner, and the cloud user.



Author information

Authors and affiliations.

Department of Computer Science, Faculty of Mathematics and Computer, Shahid Bahonar University of Kerman, Kerman, Iran

Hamed Tabrizchi & Marjan Kuchaki Rafsanjani

Corresponding author

Correspondence to Marjan Kuchaki Rafsanjani.

About this article

Tabrizchi, H., Kuchaki Rafsanjani, M. A survey on security challenges in cloud computing: issues, threats, and solutions. J Supercomput 76, 9493–9532 (2020). https://doi.org/10.1007/s11227-020-03213-1

Issue Date: December 2020

  • Cloud computing
  • Vulnerabilities
  • Data protection

COMMENTS

  1. Database Security Threats and Challenges

    Most database security features have to be developed to secure the database environment. The aim of the paper is to underline the types of threats and challenges and their impact on sensitive data and to present different safety models. The assumption underpinning this study is that it understands the weaknesses, threats and challenges faced by ...

  2. Database Security: An Overview and Analysis of Current Trend

    This paper discusses the basics of databases, including their meaning, characteristics and role, with a special focus on the different security challenges they face. It also highlights the basics of security management and the relevant tools. The different areas of database security are thus presented in simple terms.

  3. 2425 PDFs

    Explore the latest full-text research PDFs, articles, conference papers, preprints and more on DATABASE SECURITY. Find methods information, sources, references or conduct a literature review on ...

  4. Advancing database security: a comprehensive systematic ...

    This SMS study aimed to identify the most up-to-date research in database security and the different challenges faced by users/clients using various databases from a software engineering perspective. In total, 20 challenges were identified related to database security. ... 1.2 Motivation for the paper. Several research in the literature seeks ...

  5. Security Analysis, Threats, & Challenges in Database

    assesses existing explorations and research challenges on this specific area. Keywords: DBMS, Threats, CIA, AES, Security. 1. INTRODUCTION. In a database there may be valuable and sensitive data ...

  6. Database Security and Encryption: A Survey Study

    In this paper, the aim is to offer a refreshed perspective of the security measures implemented in databases nowadays, with a comparison study of two popular databases: Oracle and Microsoft SQL.

  7. Database security

    In this paper, we first survey the most relevant concepts underlying the notion of database security and summarize the most well-known techniques. We focus on access control systems, on which a large body of research has been devoted, and describe the key access control models, namely, the discretionary and mandatory access control models, and ...

  8. Database Security: Attacks and Solutions

    This research paper coheres databases and its security in any organization. Issues of unauthorized access, deception, vulnerability, authentication and fabrication has been discussed along with the solutions to these attacks. ... key management system and comprehensive protection will positively impact and would tell the importance and delicacy ...

  9. Data security governance in the era of big data: status, challenges

    Global status of data security governance. Countries and economic communities across the globe have devised countermeasures to cope with emerging big data security issues, and prepare for upcoming problems through enhancing data security governance. 1.1. Stepping up legislative efforts in protecting personal data.

  10. Database security: Research and practice

    In this paper, we survey the state of the art in access control for database systems, discuss the main research issues, and outline possible directions for future research. ...

  11. Data Security as a Top Priority in the Digital World: Preserve Data

    The aim of this paper is to examine both current data security research and to analyse whether "traditional" vulnerability registries provide a sufficient insight on DBMS security, or they should be rather inspected by using IoTSE-based and respective passive testing, or dynamically inspected by DBMS holders conducting an active testing ...

  12. database security Latest Research Papers

    Cryptographic technique is an alternative solution that can be used in database security. One way to maintain the security of the database is to use encryption techniques. The method used to secure the database is encryption using the ROT13 and Caesar Cipher methods. Both of these methods have advantages in processing speed.

  13. Data Security and Privacy in Cloud Computing

    In this paper, we will review different security techniques and challenges for data storage security and privacy protection in the cloud computing environment. As Figure 1 shows, this paper presents a comparative research analysis of the existing research work regarding the techniques used in the cloud computing through data security aspects ...

  14. Database security threats: A survey study

    To secure a database environment, many database security models need to be developed. The purpose of the paper is to highlight threat types and their impacts on sensitive data, and to present different security models. The assumption underlying this study is that by understanding the weaknesses and the threats facing databases, database ...

  15. Editorial: Introduction to Data Security and Privacy

    The paper, after providing a comprehensive set of system requirements toward addressing such problem, presents formal methods for the verification of security policies specified for the integrated data. This paper is an excellent reference for anyone interested in exploring data security in the context of data integration systems. J.

  16. Cybersecurity data science: an overview from machine learning

    In a computing context, cybersecurity is undergoing massive shifts in technology and its operations in recent days, and data science is driving the change. Extracting security incident patterns or insights from cybersecurity data and building corresponding data-driven models is the key to making a security system automated and intelligent. To understand and analyze the actual phenomena with data ...

  17. The Impact of Artificial Intelligence on Data System Security: A

    This paper aims at identifying research trends in the field through a systematic bibliometric literature review (LRSB) of research on AI and system security. The review entails 77 articles published in the Scopus® database, presenting up-to-date knowledge on the topic. The LRSB results were synthesized across current research subthemes ...

  18. Journal of Cybersecurity and Privacy

    A Feature Paper should be a substantial original Article that involves several techniques or approaches, provides an outlook for future research directions and describes possible research applications. Feature papers are submitted upon individual invitation or recommendation by the scientific editors and must receive positive feedback from the ...

  19. Full article: Cybersecurity Deep: Approaches, Attacks Dataset, and

    Various security organizations worldwide continue to develop innovative techniques to protect peripherals and sensitive data from cyberattacks. Broad security practices include ... The remaining paper is structured as follows. ... Most research results were proposed using the public database. The research should highlight building a real-time ...

  20. Cyber risk and cybersecurity: a systematic review of data availability

    Finally, this research paper highlights the need for open access to cyber-specific data, without price or permission barriers. ... Moreno et al. developed a database of 300 security-related accidents from European and American sources. The database contained cybersecurity-related events in the chemical and process industry.

  21. Journal of Medical Internet Research

    Background: Health apps are increasingly recognized as crucial tools for enhancing health care delivery. Many countries, particularly those in sub-Saharan Africa, can substantially benefit from using health apps to support self-management and thus help to achieve universal health coverage and the third sustainable development goal. However, most health apps published in app stores are of ...

  22. Globalization and the dynamics of national security in the 21st century

    It is the position of this paper that national and international security dynamics have changed because of globalization. It is desk research and relies basically on secondary data. The paper therefore recommends that states adopt security strategies that are in line with global trends so as to address security challenges.

  23. A survey on security challenges in cloud computing: issues, threats

    In fact, this paper completely focuses on data security issues and also presents methods to protect the data and its privacy. Khalil et al. ... In fact, this research attempted to show various security challenges, vulnerabilities, attacks, and threats that hamper the adoption of cloud computing. Our paper provided a survey on cloud security ...