Last Updated: 2023-05-10 00:09:39 UTC
by Russ McRee (Version: 1)
Part 2 of 2
In part 2 of our exploratory data analysis (EDA), a mission critical task underpinning the predominance of detection development and preparation for cybersecurity-centric machine learning, a remimder that there are a number of actions that analysts can take to better understand a particular data set and ready it for more robust utilization. That includes forecasting, which we will perform, again using the University of Maryland’s Center for International and Security Studies (CISSM) Cyber Attacks Database, an ideal candidate for experimental exploration. Per the dataset description, the database “brings together open-source information surrounding a range of publicly acknowledged cyber events on private and public organizations. Events from 2014 through present have been coded to standardize information on threat actor, threat actor country, motive, target, end effects, industry, and country of impact. Source links to the news source are also provided” (Harry & Gallagher, 2018).
We continue this exploratory data analytics journey with forecasting models and plots to show how you might predict future attack volumes.
Next, we model the CISSM CAD data for time series forecasting with three well known methods selected for performance with the CISSM CAD dataset. These include naive, SES, and ARIMA.
Naive forecasting uses the most recent observation as the forecast for the next observation.
Simple exponential smoothing (SES) is the method of time series forecasting used with univariate data with no trend and no seasonal pattern.
Autoregressive integrated moving average, or ARIMA, is a statistical analysis model that uses time series data to either better understand the data set or to predict future trends.
Before we initiate the forecasts we use each of the methods to determine which one performs best. For brevity here, we’ll only run the models and plots on exploitative data, but the forecasts_CISSM.R script and the Jupyter/Colab notebook in the GitHub repo run all models and plots with disruptive and exploitative data. Note again that these are subsets of a much broader dataset represented by CISSM CAD. Explore further to your liking.
Important to the modeling that follows, note the evaluation metrics RMSE and MAE. Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE) are used to evaluate regression models, tell us how accurate our predictions are, and the amount of deviation from actual values (Acharya, 2021). In essence, the lower the score, the better the performance.
> naive_model_exploitative <- naive(exploitative, h = 12) > summary(naive_model_exploitative) Forecast method: Naive method Model Information: Call: naive(y = exploitative, h = 12) Residual sd: 20.5334 Error measures: ME RMSE MAE MPE MAPE MASE ACF1 Training set -0.06481481 20.5334 15.41667 -Inf Inf 0.6228308 -0.3349566 Forecasts: Point Forecast Lo 80 Hi 80 Lo 95 Hi 95 Feb 2023 23 -3.314606 49.31461 -17.24472 63.24472 Mar 2023 23 -14.214473 60.21447 -33.91463 79.91463 Apr 2023 23 -22.578235 68.57824 -46.70590 92.70590 May 2023 23 -29.629213 75.62921 -57.48944 103.48944 Jun 2023 23 -35.841249 81.84125 -66.98992 112.98992 Jul 2023 23 -41.457358 87.45736 -75.57902 121.57902 #snipped
The naive model yields an RMSE score of 20.5 and an MAE score 15.4.
> ses_model_exploitative <- ses(exploitative$Exploitative, h = 12) # RMSE = 18.8, MAE = 13.9 > summary(ses_model_exploitative) Forecast method: Simple exponential smoothing Model Information: Simple exponential smoothing Call: ses(y = exploitative$Exploitative, h = 12) Smoothing parameters: alpha = 0.529 Initial states: l = 32.3171 sigma: 19.0189 AIC AICc BIC 1157.442 1157.671 1165.517 Error measures: ME RMSE MAE MPE MAPE MASE ACF1 Training set 0.2068036 18.84358 13.92841 -Inf Inf 0.9034647 0.03522811 Forecasts: Point Forecast Lo 80 Hi 80 Lo 95 Hi 95 110 44.24184 19.8681717 68.61551 6.965531 81.51815 111 44.24184 16.6677635 71.81592 2.070929 86.41275 112 44.24184 13.8020028 74.68168 -2.311874 90.79555 113 44.24184 11.1837445 77.29994 -6.316154 94.79983 114 44.24184 8.7581582 79.72552 -10.025768 98.50945 115 44.24184 6.4880896 81.99559 -13.497539 101.98122 #snipped
The SES model yields an RMSE score of 18.8 and an MAE score of 13.9.
> arima_model_exploitative <- auto.arima(exploitative) > summary(arima_model_exploitative) Series: exploitative ARIMA(2,1,1) Coefficients: ar1 ar2 ma1 0.4744 0.1671 -0.9653 s.e. 0.1021 0.1007 0.0323 sigma^2 = 343.6: log likelihood = -467.68 AIC=943.35 AICc=943.74 BIC=954.08 Training set error measures: ME RMSE MAE MPE MAPE MASE ACF1 Training set 2.152378 18.19375 13.5603 -Inf Inf 0.5478338 -0.01933472
Finally, the ARIMA model yields an RMSE score of 18.2 and an MAE score of 13.6. Ultimately, by a small margin, the ARIMA model is most likely to provide the best forecast. Next, we forecast and plot the results. Given that ARIMA is most reliable under these circumstances, we’ll focus on visualizing ARIMA results; you can experiment with naive and SES plots on your own via the scripts or the notebook.
Forecasts & Plots
Generating a plot of AllEvents is as easy as:
Figure 6: CISSM CAD AllEvents plot
This is just as easy with disruptive or exploitative events exclusively with the likes of
To forecast the exploitative ARIMA model in an individual plot, utilize:
forecast(arima_model_exploitative) %>% autoplot()
Figure 7: CISSM CAD exploitative events forecast plot
The light and dark areas correspond to the 95% and 80% confidence intervals (CI) respectively.
You can join multiple plots to compare outcomes side by side as follows.
naiveEXP = forecast(naive_model_exploitative) %>% autoplot() sesEXP = forecast(ses_model_exploitative) %>% autoplot() arimaEXP = forecast(arima_model_exploitative) %>% autoplot() multi.pageEXP <- ggarrange(naiveEXP, sesEXP, arimaEXP, nrow = 3, ncol = 1) multi.pageEXP
Figure 8: CISSM CAD exploitative events multi-model forecast plot
You may be wondering what ARIMA(2,1,1) refers to in our plots. A nonseasonal ARIMA model, which this is, is classified as an “ARIMA(p,d,q)” model, where: p is the number of autoregressive terms, d is the number of nonseasonal differences needed for stationarity, and q is the number of lagged forecast errors in the prediction equation. Therefore, in this case, (2,1,1) is p,d,q found by the auto.arima process indicating that we have two auto-regessive terms, one difference, and one moving average term in our series (Nau, 2020).
Hopefully, this effort has been useful and insightful for security analysts as well as fledgling data scientists in the security realm. It’s no surprise that I orient towards the practices of visualization; I have found all methods deployed here to be useful, effective, and durable for future use. It is my desire that you benefit similarly, and that this opens some doors for you, literally and figuratively.
Cheers…until next time.
Last Updated: 2023-05-09 17:41:35 UTC
by Renato Marinho (Version: 1)
This month we got patches for 49 vulnerabilities. Of these, 6 are critical, and 2 are already being exploited, according to Microsoft.
One of the exploited vulnerabilities is a Win32k Elevation of Privilege Vulnerability (CVE-2023-29336). This vulnerability has low attack complexity, low privilege, and none user interaction. The attack vector is local, the CVSS is 7.8, and the severity is Important.
The second exploited vulnerability is Secure Boot Security Feature Bypass Vulnerability (CVE-2023-24932). According to the advisory, to exploit the vulnerability, an attacker who has physical access or Administrative rights to a target device could install an affected boot policy. The CVSS for this vulnerability is 6.7 and its severity is Important.
About the critical vulnerabilities, there is a Remote Code Execution (RCE) affecting Windows Network File System (CVE-2023-24941). According to the advisory, this vulnerability could be exploited over the network by making an unauthenticated, specially crafted call to a Network File System (NFS) service to trigger a Remote Code Execution (RCE). The advisory also details a mitigation procedure. The CVSS for this vulnerability is 9.8 – the highest for this month.
A second critical vulnerability worth mentioning is an RCE affecting Windows Lightweight Directory Access Protocol (LDAP) (CVE-2023-28283). According to the advisory, an unauthenticated attacker who successfully exploited this vulnerability could gain code execution through a specially crafted set of LDAP calls to execute arbitrary code within the context of the LDAP service. The attack complexity is high, which means that successful exploitation of this vulnerability requires an attacker to win a race condition. The CVSS for this vulnerability is 8.1.
See my dashboard for a more detailed breakout: https://patchtuesdaydashboard.com/
May 2023 Security Updates
|CVE||Disclosed||Exploited||Exploitability (old versions)||current version||Severity||CVSS Base (AVG)||CVSS Temporal (AVG)|
|AV1 Video Extension Remote Code Execution Vulnerability|
|Chromium: CVE-2023-2459 Inappropriate implementation in Prompts|
|Chromium: CVE-2023-2460 Insufficient validation of untrusted input in Extensions|
|Chromium: CVE-2023-2462 Inappropriate implementation in Prompts|
|Chromium: CVE-2023-2463 Inappropriate implementation in Full Screen Mode|
|Chromium: CVE-2023-2464 Inappropriate implementation in PictureInPicture|
|Chromium: CVE-2023-2465 Inappropriate implementation in CORS|
|Chromium: CVE-2023-2466 Inappropriate implementation in Prompts|
|Chromium: CVE-2023-2467 Inappropriate implementation in Prompts|
|Chromium: CVE-2023-2468 Inappropriate implementation in PictureInPicture|
|Microsoft Access Denial of Service Vulnerability|
|Microsoft Edge (Chromium-based) Elevation of Privilege Vulnerability|
|CVE-2023-29350||No||No||Less Likely||Less Likely||Important||7.5||6.5|
|Microsoft Edge (Chromium-based) Security Feature Bypass Vulnerability|
|CVE-2023-29354||No||No||Less Likely||Less Likely||Moderate||4.7||4.1|
|Microsoft Excel Remote Code Execution Vulnerability|
|Microsoft Office Remote Code Execution Vulnerability|
|Microsoft Remote Desktop app for Windows Information Disclosure Vulnerability|
|Microsoft SharePoint Server Information Disclosure Vulnerability|
|Microsoft SharePoint Server Remote Code Execution Vulnerability|
|Microsoft SharePoint Server Spoofing Vulnerability|
|Microsoft Teams Information Disclosure Vulnerability|
|Microsoft Word Security Feature Bypass Vulnerability|
|Remote Desktop Client Remote Code Execution Vulnerability|
|Remote Procedure Call Runtime Denial of Service Vulnerability|
|Secure Boot Security Feature Bypass Vulnerability|
|Server for NFS Denial of Service Vulnerability|
|SysInternals Sysmon for Windows Elevation of Privilege Vulnerability|
|Visual Studio Code Information Disclosure Vulnerability|
|Win32k Elevation of Privilege Vulnerability|
|Windows Backup Service Elevation of Privilege Vulnerability|
|Windows Bluetooth Driver Elevation of Privilege Vulnerability|
|Windows Bluetooth Driver Information Disclosure Vulnerability|
|Windows Bluetooth Driver Remote Code Execution Vulnerability|
|Windows Driver Revocation List Security Feature Bypass Vulnerability|
|Windows Graphics Component Elevation of Privilege Vulnerability|
|Windows Installer Elevation of Privilege Vulnerability|
|Windows Kernel Elevation of Privilege Vulnerability|
|Windows Lightweight Directory Access Protocol (LDAP) Remote Code Execution Vulnerability|
|Windows MSHTML Platform Security Feature Bypass Vulnerability|
|Windows NFS Portmapper Information Disclosure Vulnerability|
|Windows NTLM Security Support Provider Information Disclosure Vulnerability|
|Windows Network File System Remote Code Execution Vulnerability|
|Windows OLE Remote Code Execution Vulnerability|
|Windows Pragmatic General Multicast (PGM) Denial of Service Vulnerability|
|Windows Pragmatic General Multicast (PGM) Remote Code Execution Vulnerability|
|Windows SMB Denial of Service Vulnerability|
|Windows Secure Socket Tunneling Protocol (SSTP) Remote Code Execution Vulnerability|
|Windows iSCSI Target Service Information Disclosure Vulnerability|