Sunday, February 11, 2024

ABOUT - Dr Prafulla Dikshit

 

Dr. Prafulla Dikshit is a Research and Statistics Consultant. In a career spanning almost 25 years, he has held management positions across several industries, including Healthcare, Hospitality, and Management Consulting. Driven by an avid interest in research, he subsequently branched off into research and academics and pursued a PhD in Marketing Management, while honing his research skills and teaching Marketing, Hospitality, Tourism, and Quantitative Techniques as a Professor of Management at various reputable institutions. Dr. Dikshit also successfully completed a course in Statistics at the prestigious Stanford University, USA.

Dr. Dikshit is a numbers person with a penchant for statistical data analytics and is proficient in major statistical software such as R, Stata, SAS, and SPSS. With a doctorate and vast industry, academic, and research experience under his belt, Dr. Dikshit has worked as a research and statistics consultant for several years and has completed more than 400 industry and academic research projects within the Hospitality, Finance, Marketing, and Strategic Management domains. His interests include computational thinking and Artificial Intelligence, and he loves Python programming. He is fond of creative writing, is an active blogger, and is also working on a non-fiction book project.

HOME

Struggling with research and statistics concepts? Or looking for simple and innovative explanations for your research queries? At Statosphere Research and Statistics Consulting, we understand that statistics can be a challenging subject for many. Whether you're a student struggling with your coursework, a researcher needing assistance with data analysis, or a professional seeking to enhance your statistical skills, you are sure to find help here. This blog by Dr. Prafulla Dikshit, PhD, offers creative, educational, and interesting insights and how-tos on statistical topics of value. See if you can find what you are looking for. If not, you are welcome to post your queries in the comments, so that Dr. Dikshit can address them in future posts, as feasible.

Thursday, July 14, 2022

Bibliometrics - A Cluster Analysis of Web of Science (WOS) Literature on Cybersecurity Risk Governance and Compliance Cooperation Using VOSviewer

                                        - by Dr. Prafulla Dikshit 

(4-6 minutes read)

Introduction

Bibliometric and bibliographic analysis is a promising new analytical methodology that draws on the methods of network analysis. One of its major applications is in understanding the publication dynamics within a given field or a specific focal area of research (Andersen 2018). Cybersecurity is one such emergent field, facing numerous challenges from the proliferation of the latest digital technologies, especially within the last five years. One of the major challenges is the lack of cooperation at various levels, including industry, regional, national, and international, towards ensuring cybersecurity. This is primarily owing to the lack of a single or integrated framework for cybersecurity risk governance and compliance, and to the disjointed efforts of various national governments towards achieving cybersecurity integration. One of the underlying factors is a trust deficit. It remains to be seen whether research on the topic points to the possibility and the modalities of such cooperation (Link et al. 2018).

Method

I performed a bibliometric analysis based on a search query executed in the Web of Science database, covering peer-reviewed journal articles in the field published within the last 10 years. The search yielded a collection of 282 high-quality peer-reviewed journal articles, and the corresponding bibliographic data was exported as a text file. This data was then imported into VOSviewer, and a cluster analysis was performed on the key terms within the title, abstract, and keywords of the articles in the collection. To give a broad idea of the search, the major terms within the original search query included cybersecurity, protection, governance, risk, compliance, regulations, cooperation, firm, industry, national, and international.
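To make the idea of keyword occurrence and co-occurrence concrete, here is a minimal Python sketch of the counting that underlies this kind of analysis. It is not VOSviewer's own code, and the `records` list is a made-up stand-in for the exported Web of Science data; it only illustrates how occurrence and pairwise co-occurrence counts can be tallied from the terms attached to each article.

```python
from collections import Counter
from itertools import combinations

# Hypothetical records: each set holds the key terms found in one article's
# title, abstract, and keywords (illustrative values, not the real export).
records = [
    {"system", "standard", "solution"},
    {"system", "requirement", "standard"},
    {"regulation", "protection", "gdpr"},
    {"regulation", "protection", "privacy"},
    {"study", "threat", "practice"},
]


def count_terms(records):
    """Return per-term occurrence counts and per-pair co-occurrence counts."""
    occurrences = Counter()
    cooccurrences = Counter()
    for terms in records:
        occurrences.update(terms)
        # Sort first so each unordered pair is always stored the same way.
        for pair in combinations(sorted(terms), 2):
            cooccurrences[pair] += 1
    return occurrences, cooccurrences


occurrences, cooccurrences = count_terms(records)
print(occurrences["system"])                        # articles mentioning 'system'
print(cooccurrences[("protection", "regulation")])  # articles mentioning both
```

Terms that frequently appear together in the same articles end up with high pair counts, and it is this pair structure that a clustering tool then groups into themes.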

Discussion

This analysis yielded a fascinating structure of keyword occurrence clusters within the topical area, as shown in Figure 1 below. The figure shows broadly three keyword clusters within the literature on the topic, represented by the blue, green, and red node clusters. The keywords in a given cluster co-occur in a set of thematically related publications. The size of a node shows the occurrence frequency of an individual keyword, while its distance from another keyword node shows their relative co-occurrence and strength of association. A thematic aggregation of the green cluster shows a pattern of four prominent interconnected nodes, System-Requirement-Standard-Solution, in order of importance by occurrence. Here the 'system' node is the largest; it co-occurs with the other three major keywords and shows the direction the research in this cluster may be taking. The 'system' node is close to the 'standard' and 'solution' nodes, which are closer still to each other, indicating that standard solutions or solution standards within cybersecurity application systems are a major research focus in this cluster. These are in turn being assessed against the requirements of the system, as indicated by the proximity of this system-standard-solution sub-cluster to the prominent yet slightly distant 'requirement' node. The other smaller and more distant nodes in the green cluster provide more context and granularity to the research area. For example, nodes like 'service', 'assessment', and 'operation' give a firm- and industry-level thrust to the theme of cybersecurity system solutions through the assessment of service operation requirements (Akanfe, Valecha, and Rao 2020; Rosado et al. 2022; Wang et al. 2020).
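As a rough illustration of what "strength of association" between two keywords can mean numerically, the Python sketch below divides a pair's co-occurrence count by the product of the two keywords' occurrence counts. This is one commonly used normalisation and is assumed here for illustration; the post does not state which measure was configured in the tool, and the counts are hypothetical.

```python
def association_strength(cooccurrence, occurrences_a, occurrences_b):
    """Co-occurrence count divided by the product of the occurrence counts."""
    if occurrences_a == 0 or occurrences_b == 0:
        return 0.0
    return cooccurrence / (occurrences_a * occurrences_b)


# Hypothetical counts, chosen only to illustrate the calculation.
occurrences = {"system": 40, "standard": 20, "requirement": 25}
cooccurrences = {("standard", "system"): 16, ("requirement", "system"): 5}

for (a, b), count in cooccurrences.items():
    strength = association_strength(count, occurrences[a], occurrences[b])
    print(f"{a}-{b}: {strength:.4f}")
```

With these made-up numbers, the standard-system pair scores higher than the requirement-system pair, which matches the intuition that more strongly associated keywords would be drawn closer together on the map.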

Similarly, the blue cluster represents a GDPR- and regulation-centric intellectual structure, GDPR being short for General Data Protection Regulation. The overarching keyword co-occurrence structure of the blue cluster is Regulation-Protection-Implementation-GDPR-Application-Device-Privacy-Challenge. 'Regulation' is the largest and key node here, and its nearest prominent nodes are 'requirement' and 'protection'. This may be interpreted as a focus on protection requirements through regulation. Further, GDPR appears as the most prominent cybersecurity regulation, and the application of its provisions for device-based privacy protection, along with the underlying challenges, is likely the research focus. This is consistent with the cybersecurity challenges arising from new connected-device technologies such as IoT (Jagannathan and Sorini 2015; Jideani et al. 2018).

The red cluster is represented primarily by a node named 'study'; the closest and largest node to it is 'threat', followed by 'research', 'practice', and 'context'. This shows that there is a pertinent research effort to study threats and cybersecurity practices with a focus on their contexts (Hare 2016; Topping et al. 2021).

 Figure 1. Cluster diagram for the research on the topical area within the Cybersecurity domain.

Conclusion

Overall, research on cybersecurity cooperation as a problem area is focused along two lines: the study of threats, and system requirements for protection regulation and implementation to mitigate those threats, which forms the solution side of the research. This article demonstrates the power of keyword cluster analysis in capturing research insights from the visual aggregation of research themes, and in identifying recent and upcoming research directions on the solution side of a broad research problem area.

***


 

References

Akanfe, Oluwafemi, Rohit Valecha, and Raghav H. Rao. 2020. "Assessing country-level privacy risk for digital payment systems." Computers & Security 99: 102065.

Andersen, Jan. 2018. "Chapter 6 - Preaward - Project Preparation." In Research Management: Europe and Beyond, 147-171.

Hare, Stephanie. 2016. "For your eyes only: U.S. technology companies, sovereign states, and the battle over data protection." Business Horizons 59: 549-561.

Jagannathan, Srinivasan, and Adam Sorini. 2015. "A cybersecurity risk analysis methodology for medical devices." 2015 IEEE Symposium on Product Compliance Engineering (ISPCE). IEEE. 1-6.

Jideani, Paul, Louise Leenen, Bennet Alexander, and Jay Barnes. 2018. "Towards an electronic retail cybersecurity framework." 2018 International Conference on Advances in Big Data, Computing and Data Communication Systems (IcABCD). IEEE. 1-6.

Link, Jochen, Karl Waedt, Ben Ines Zid, and Xinxin Lou. 2018. "Current Challenges of the Joint Consideration of Functional Safety & Cyber Security, Their Interoperability and Impact on Organizations: How to Manage RAMS + S (Reliability Availability Maintainability Safety + Security)." 2018 12th International Conference on Reliability, Maintainability, and Safety (ICRMS). IEEE. 185-191.

Rosado, David G., Antonio Santos-Olmo, Luis Enrique Sánchez, Manuel A. Serrano, Carlos Blanco, Haralambos Mouratidis, and Eduardo Fernández-Medina. 2022. "Managing cybersecurity risks of cyber-physical systems: The MARISMA-CPS pattern." Computers in Industry 142: 103715.

Topping, Colin, Andrew Dwyer, Ola Michalec, Barnaby Craggs, and Awais Rashid. 2021. "Beware suppliers bearing gifts!: Analysing coverage of supply chain cyber security in critical national infrastructure sectorial and cross-sectorial frameworks." Computers & Security 108: 102324.

Wang, Di, Yan Zhu, Yi Zhang, and Guowei Liu. 2020. "Security Assessment of Blockchain in Chinese Classified Protection of Cybersecurity." IEEE Access 203440-203456.

 

 

Saturday, September 5, 2020

How to calculate PACF and produce its correlogram in MS Excel


In a previous post, I explained how to calculate and plot an ACF correlogram in Excel. Now we move on to how the PACF is calculated and plotted using Excel. First, let's review what the Partial Autocorrelation Function, or PACF, is and how it differs from the ACF. While the ACF is the correlation between a time series and its own k-lagged values, the partial autocorrelation is the correlation between two lagged copies of the series, Tk-p and Tk, where p is the number of lags between them, computed after controlling for the effect of the intermediate lags. The autocorrelation we get is thus a pure autocorrelation between the Tk-p and Tk series.
It is also well known that the regression coefficients in a multiple regression represent the association between each independent variable and the dependent variable, while controlling for the other independent variables. Thus, we use multiple regression analysis to calculate the PACF for the AAPL time series we have been working with in this blog series.
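For reference, the regression view of the PACF is usually written along the following lines in textbooks. The notation (the coefficients φ, the error term ε, and the time index t) is assumed here for illustration and does not appear in the Excel sheets.

```latex
T_t = \beta_0 + \phi_{k1}\,T_{t-1} + \phi_{k2}\,T_{t-2} + \dots + \phi_{kk}\,T_{t-k} + \varepsilon_t,
\qquad \mathrm{PACF}(k) = \phi_{kk}
```

In words, the partial autocorrelation at lag k is the coefficient on the k-th lag when the series is regressed on its first k lags, so that the shorter lags are held constant.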
First, we created 10 lagged data series, as shown below:
 

While the series above is shown up to 21 data points, we have considered 100 data points out of the total 365 for the series. It would be better still to use all 365 points. We created the data series labeled L1 to L10 by shifting the original data by one cell each time, as shown above. So the first data series starts at sl. no. 0, the next at 1, and so on up to 10. This creates a cascaded data set, as also shown by the lower end of the dataset below:
 
However, the cascaded, one-lag-offset parts at both ends of the data are excluded, as marked by the yellow regions. This is done to ensure that we have a compatible lagged data series for multiple regression in Excel.
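The same lagging and trimming can be sketched in Python for readers who want to check their sheet. The function name `build_lagged_rows` and the short `prices` list are hypothetical stand-ins, not the AAPL data; the sketch keeps only the rows for which every lag has a value, which corresponds to excluding the cascaded ends.

```python
def build_lagged_rows(series, max_lag):
    """Return (y, X), where X[i] holds the values 1..max_lag steps before y[i]."""
    y, X = [], []
    # Start at max_lag so that every row has a value for every lag column;
    # this excludes the incomplete cascaded rows at the ends of the sheet.
    for t in range(max_lag, len(series)):
        y.append(series[t])
        X.append([series[t - lag] for lag in range(1, max_lag + 1)])
    return y, X


prices = [10.0, 11.0, 12.5, 12.0, 13.0, 14.5, 14.0]  # hypothetical values
y, X = build_lagged_rows(prices, max_lag=3)

for target, lags in zip(y, X):
    print(target, lags)  # the value, then its L1, L2, L3 predecessors
```

Each printed row pairs a value of the series with its lagged predecessors, which is the rectangular layout a regression tool expects for its Y and X ranges.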
Next, we ran a multiple regression in Excel using Data Analysis > Regression, with the original AAPL series column as the Y range and the lagged data in the columns labeled L1 through L10 as the X range, as shown below.
 

We receive the following output:
 
Thus, we have the PACF data for L1 through L10 in the Coefficients column.
This data is then plotted against lags 1 through 10 to arrive at the correlogram for the PACF.
 
 


Thus, here we are with the correlogram for the PACF using Excel.
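For readers who would like to cross-check the Excel numbers, the regression approach can also be sketched in plain Python. One caveat worth keeping in mind: under the usual textbook definition, the PACF at lag k is the coefficient on lag k from a regression on lags 1 through k only, so a single regression on all ten lags matches that definition exactly only at the last lag. The sketch below therefore runs one regression per lag. The helper names (`solve`, `ols`, `pacf_by_regression`) and the simulated series are assumptions for illustration, not the AAPL data, and the normal-equations solver is a simple stand-in for Excel's regression tool.

```python
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def ols(y, X):
    """Least-squares coefficients [intercept, b1, ..., bp] via normal equations."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(XtX, Xty)


def pacf_by_regression(series, max_lag):
    """PACF at lag k = coefficient on lag k when regressing on lags 1..k."""
    values = []
    for k in range(1, max_lag + 1):
        y = series[k:]
        X = [[series[t - lag] for lag in range(1, k + 1)]
             for t in range(k, len(series))]
        values.append(ols(y, X)[-1])  # last coefficient belongs to lag k
    return values


# Simulated AR(1) series (hypothetical data) with coefficient 0.7.
random.seed(0)
series = [0.0]
for _ in range(499):
    series.append(0.7 * series[-1] + random.gauss(0, 1))

pacf = pacf_by_regression(series, max_lag=3)
print([round(v, 3) for v in pacf])
```

For a series like this simulated one, the lag-1 value should come out near the coefficient used to generate it, with the higher lags close to zero. Plotting these values against the lag number gives the same kind of correlogram as the Excel chart.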