Test Case Generation
Introduction
This article reviews the use of AI tools to generate cybersecurity compliance test cases that create evidence of compliance with a cybersecurity standard (the US Department of War (DoW) Cybersecurity Maturity Model Certification (CMMC) standards are used in these scenarios). This is the third article in a blog series on applying AI technology as an aid in developing a cybersecurity compliance program. An overview document providing the background for this blog series can be found here:
(https://www.complianceautomationenineeirng.com/blogoverview)
That article provides the scenario that is the basis for this blog series and information about the environment used for this use case.
Disclaimer: All materials created in this blog series were developed solely by, are the opinions of the author and are provided with no express or implied warranty. Other than outputs generated by the AI models used in this exercise, no AI was used to prepare this blog series (in case you were wondering). The author accepts no liability for the use of any of these materials by the reader. We strongly encourage engaging appropriate professionals for developing materials for use within your own organization.
Cybersecurity Compliance Test Cases
Test cases can be used as a means to generate and collect evidence of compliance with a cybersecurity standard. Compliance with many cybersecurity requirements can be demonstrated through computer device configuration settings and tests that collect information about cybersecurity related systems. Although test cases per se are not an explicit requirement of the CMMC standards, programs in the form of scripts can easily be prepared that read and report device settings, which can satisfy a given requirement. Further, once written, these scripts can be reused to monitor the state of the system under review, providing a form of continuous cybersecurity monitoring, an acknowledged best practice.
An example:
CMMC Access Control (AC) requirement AC.L2-3.1.8 – Unsuccessful Logon Attempts states “Limit unsuccessful logon attempts” (ref. CMMC Assessment Guide Level 2, Version 2.13, September 2024 (ZRIN 0790-ZA19)).
To further clarify the requirement, the associated assessment objectives consist of:
Determine if:
[a] the means of limiting unsuccessful logon attempts is defined; and
[b] the defined means of limiting unsuccessful logon attempts is implemented.
The first objective [a] is a documentation related requirement, meaning that somewhere in the organization’s technical documentation (policy / procedure, System Security Plan (SSP), internal standards / guidelines, etc.) there should be a statement on how the organization implements control of unsuccessful logon attempts.
Most likely this will refer to a configuration setting that controls the maximum number of unsuccessful logon attempts. In the case of Microsoft Windows 11, the following registry setting controls that parameter:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\RemoteAccess\Parameters\AccountLockout
In the organization’s systems documentation, a value would be specified, perhaps a maximum of 5 unsuccessful attempts. To meet objective [a], the documentation would specify where this setting is documented and its required value.
The second objective [b] is a natural candidate for an automated test case. It is a typical example of an objective that can be easily satisfied with a script.
This test case would read the setting and report it as evidence that the requirement is met, assuming it is set to 5 per the organization’s documented specification. The following Windows PowerShell command reports the current logon attempt lockout settings:
net accounts
Here is an example result of that command, showing that the Lockout threshold is set to 10, which would fail the requirement; it would need to be set to 5 to meet the organization’s standard:
Using the above command as its basis, an automated script can extract the Lockout Threshold value from the generated output and report the result. This provides the raw data (the current value of the Lockout threshold setting in this case) for addressing the control objective. This serves as a data point for a compliance assessment.
A second step can be to compare the retrieved value to the required setting and declare whether the objective is met or not. This assumes that the required value is available in electronic form and not embedded in a document (see the discussion “A few words about ODPs” below).
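As a concrete sketch of these two steps, the fragment below extracts the lockout threshold from a saved `net accounts` report and compares it to a required value of 5. This is illustration only: the file name, the sample report text, and the hard-coded requirement are my assumptions, not generated output from the exercise.

```shell
#!/usr/bin/env bash
# Sketch only: parse a captured 'net accounts' report and compare the
# lockout threshold to the organization's documented value (assumed 5).
set -euo pipefail

REQUIRED=5

# Sample captured output; in practice this would come from running
# 'net accounts' on the Windows target and saving the report.
cat > results.txt <<'EOF'
Force user logoff how long after time expires?:       Never
Lockout threshold:                                    10
Lockout duration (minutes):                           30
EOF

# Step 1: extract the raw value (the evidence data point).
actual=$(awk -F: '/Lockout threshold/ {gsub(/ /,"",$2); print $2}' results.txt)
echo "Lockout threshold reported as: $actual"

# Step 2: compare to the required setting and declare the result.
if [ "$actual" -gt 0 ] && [ "$actual" -le "$REQUIRED" ]; then
  echo "PASS: threshold ($actual) meets requirement (<= $REQUIRED)"
else
  echo "FAIL: threshold ($actual) does not meet requirement (<= $REQUIRED)"
fi
```

With the sample value of 10, the sketch reports FAIL, matching the narrative above: the setting would have to be lowered to 5 to pass.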
By their nature, test cases are operating system (OS) and version specific. Most organizations have a mixture of operating systems across their devices (some may be firmware based). Commands to produce test case results need to be generated for the correct OS, version and device type. Some devices may be networking appliances (e.g., Cisco), some may be office workstations (e.g., Microsoft Windows or Apple macOS) while others may be system servers (possibly Linux based).
Each of these environments requires expertise to know which commands to use and how to use them in test cases. This expertise most likely resides in different individuals or groups within an organization, or may rely upon vendors to provide. Using AI tools to generate test cases can help compliance analysts, who may not be experts in all systems environments, by providing starting points for creating test cases and identifying parameter settings for specific systems. This can greatly expedite development. When needed, experts on specific environments can be consulted to validate the meaning of specific parameter settings. This is much more efficient than tying up valuable resources with creating test cases, and it leverages their expertise.
A few words about ODPs (Organization-Defined Parameters): This blog series was developed using the currently in-effect version of the CMMC standards, Version 2, from the US Department of War (DoW). These standards are based on the NIST SP800-171 Revision 2 control standards, which NIST has since updated and replaced with Revision 3. One of the improvements in the Revision 3 standard is the formal introduction of Organization-Defined Parameters (ODPs) (ref. NIST SP800-171A, Rev. 3). 87 ODPs are defined in the NIST SP800-171 Revision 3 standards.
One of the ODP’s is defined as:
A.03.01.08.ODP[01]: the number of consecutive invalid logon attempts by a user allowed during a time period is defined.
If the organization has developed assigned settings for its ODPs, they can be used directly in test cases like the one above. Parameterized ODP values make evaluation of objectives easier and more precise, and they facilitate automated compliance reporting. Even though the currently in-effect CMMC standard (based on NIST SP800-171 Rev. 2) does not include ODPs, they can be used now to expedite test case development and compliance assessments, and should be considered a best practice. Kudos to NIST for these types of improvements.
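A minimal sketch of keeping assigned ODP values in electronic form where test cases can consume them directly. The CSV file name and layout here are my own assumptions for illustration, not part of any NIST or CMMC artifact:

```shell
#!/usr/bin/env bash
# Sketch: look up an assigned ODP value from a simple CSV table and use
# it as the pass/fail threshold in a test case. Layout is assumed.
set -euo pipefail

# Hypothetical ODP table: identifier,assigned value
cat > odp_values.csv <<'EOF'
A.03.01.08.ODP[01],5
A.03.01.08.ODP[02],15
EOF

# Fetch the assigned value for the lockout-attempts ODP by exact match
# on the identifier column.
required=$(awk -F, '$1 == "A.03.01.08.ODP[01]" {print $2}' odp_values.csv)
echo "Required maximum unsuccessful logon attempts: $required"
```

A test case can then compare the value it collects from a device against `$required` instead of a number buried in a policy document, which is what makes automated compliance reporting practical.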
Inputs
The following input files were used to provide contextual information to ChatGPT V5 for test case generation.
- ACObjectives.csv – the list of control objectives for generating test cases. Note that this list covers the Access Control (AC) family only and does not include objectives from the other CMMC control families.
- ANCyberAssets.csv – the list of assets that are the targets for the tests
- AlphaNineDescription.doc – The description of the environment for which the test cases are to be generated.
Prompt
Following is the initial ChatGPT prompt used for the test case generation scenario. The full chat sequence is lengthy and can be seen here.
The initial response from the above prompt failed to produce a usable result. ChatGPT attempted to generate one script per objective (in the AC control family there were a total of 70 objectives), one for each of the 45 cyber assets. That would result in 2,760 total scripts.
ChatGPT gave up and stopped after generating 684 scripts, presumably because of AI model token limitations. This behavior of simply stopping partway through a generation task (be it code generation, document generation, etc.) is typical of Large Language Models (LLMs). When they run out of context (token) capacity for the task, they tend to just give up and stop.
In addition, when reviewing the generated scripts, the vast majority appeared to be generic Linux-style bash scripts and not specific to the target operating environments in the device asset list. Lesson learned: be careful what you ask for (or more specifically, this is a case where further prompt engineering was needed).
To remedy the large volume generation issue, I added this next prompt in the ChatGPT dialog:
Please make the following changes to the tests and the test generation:
- Please generate tests on a per operating system environment basis. This means that there should be one test per operating system (OS) and version (OSVer) for each control objective. Note that one test may address multiple objectives in which case one test should be generated and marked as addressing the multiple objectives. If it is easier to manage, please generate the actual individual test code for each test (per OS environment) instead of using the test code generator approach.
- For each generated test case, please record the Operating system, the Operating system version, the control objective number met, the control number met and include the description associated with the control objective so that auditing of test runs can associate the test results with the corresponding control objective.
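One lightweight way to satisfy the record-keeping request in the second bullet is a manifest file that audit tooling can join against test run results. This sketch is my own illustration (the file name, field order, and script file name are assumptions); the sample entry mirrors the AC.L2-3.1.8 example used throughout this article:

```shell
#!/usr/bin/env bash
# Sketch: record per-test audit metadata in a manifest CSV so test
# results can be traced back to the control objective. Layout assumed.
set -euo pipefail

manifest="test_manifest.csv"
echo "OS,OSVer,Control,Objective,Description,ScriptFile" > "$manifest"

# Append one record per generated test.
add_test() {
  echo "$1,$2,$3,$4,\"$5\",$6" >> "$manifest"
}

add_test "Windows" "11" "AC.L2-3.1.8" "3.1.8[b]" \
  "the defined means of limiting unsuccessful logon attempts is implemented" \
  "AC.L2-3.1.8-lockout-windows.ps1"

cat "$manifest"
```

When a test run completes, joining its results against such a manifest associates each PASS/FAIL with the corresponding control objective and description.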
The result was a set of scripts organized by operating system and objective. The generated scripts were now OS appropriate and very much closer to the intended result. There were some anomalies in the macOS and Cisco ASA versions of the scripts. In addition, as an enhancement, I added a request to generate explanations of how a given script addresses the control objective. These changes were addressed with the following dialog prompts:
The generated scripts for MacOS appear to use an older version that uses outdated commands. For example, the generated MACOS test for AC.L2-3.1.21, uses the lastlog command, but the current, correct command is “log” and requires different command parameters. Second example, test AC.L2-3.1.19 generated for MACOS uses the lsblk command that is not available on MacOS (lsblk is generic linux) – macos uses the diskutil command. Third example, the MACOS test for AC.L2-3.1.17 uses the nmcli command that is not a current MACOS command. Is it possible to generate commands that use the current supported MACOS commands?
The generated scripts that are for CISCO ASA appear to be generic linux commands, not specific CISCO ASA operating system commands. As such, they will not work on cisco operating systems. Please correct to use CISCO ASA specific commands. Refer to the CISCO command reference for list of valid CISCO ASA operating system commands:
Please add comments to the generated test cases that explains how the test achieves the objective(s). For example, if the objective states “Determine if the execution of privileged functions is captured in audit logs”, the explanation should be something like: “audit logging is configured to capture privileged functions because <explain>” and indicate that the test was passed successfully. The explanation should be included as comments in the generated test scripts. In the case where an objective is not met indicate the test failed.
ChatGPT responded with:
Response
Absolutely — here’s a clean regeneration of test cases that:
- Adds HowTestMeetsObjective (and tightens SuccessCriteria) explaining how each test proves/doesn’t prove the objective.
- Writes the same explanation into each generated script’s comments and also prints PASS/FAIL + explanation into results.txt.
- Modernizes macOS commands (no lsblk, nmcli, lastlog misuse; uses fdesetup, log, networksetup, systemsetup, auditconfig, etc.).
- Introduces Cisco ASA targeting: if an asset’s OS contains ASA or Cisco ASA, it generates an ASA command pack (to be executed from a jump host via SSH or your agent’s Netmiko runner). These are ASA-native show/more system:running-config queries (not GNU/Linux commands). You can flip between “raw CLI file” or a ready-to-run Python/Netmiko wrapper by a flag.
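To make the “command pack executed from a jump host” idea concrete, here is a small sketch of a runner. Everything here is a placeholder: the host name, file names, and the fact that `RUNNER` defaults to `echo` so the sketch runs without a real device (an actual deployment would set it to an `ssh` invocation or use a Netmiko-style agent):

```shell
#!/usr/bin/env bash
# Sketch: run an ASA-native command pack from a jump host and capture
# the output as evidence. RUNNER defaults to 'echo' so this is runnable
# without a device; for real runs set RUNNER="ssh admin@$ASA_HOST".
set -euo pipefail

ASA_HOST="asa.example.internal"   # placeholder hostname
RUNNER="${RUNNER:-echo}"

# Hypothetical command pack of ASA show commands.
cat > asa_command_pack.txt <<'EOF'
show running-config | include username|privilege
show logging
show aaa-server
EOF

: > results.txt
while IFS= read -r cmd; do
  printf '== %s ==\n' "$cmd" >> results.txt
  $RUNNER "$cmd" >> results.txt   # on a real run, ssh executes this on the ASA
done < asa_command_pack.txt
```

The captured `results.txt` then serves as the evidence file for the ASA asset, in the same way the PowerShell scripts capture evidence on Windows targets.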
The results from this prompt were very good and addressed the issues noted above.
Review of Results
In total, 171 test cases were generated, organized by CMMC standard and OS. Each script includes the following as comments:
- Control: The CMMC requirement control ID for the script (e.g., AC.L1-3.1.1)
- OS: The OS target for the script
- OSVer: The OS version for the script
- TestKey: The parameter that forms the test key for the script
- Objective: The CMMC objective for the script
- HowThisTestMeetsObjective: A description of how the script meets the objective being tested
- SuccessCriteria: The criteria that is used to determine if the test was a success or not (objective met or not met)
The comments are followed by the executable script code.
Here is the generated Windows 11 script for the above requirement objective [b] (“the defined means of limiting unsuccessful logon attempts is implemented”) of CMMC control requirement AC.L2-3.1.8: Unsuccessful Logon Attempts:
# Control: AC.L2-3.1.8
# OS: Windows
# OSVer: 23
# TestKey: lockout_policy
# Requirement: AC.L2-3.1.8
# HowThisTestMeetsObjective: Unsuccessful login lockout is enforced. Windows via 'net accounts' & security policy; Linux via pam_faillock; macOS via 'pwpolicy' global policy.
# SuccessCriteria: results.txt must show lockout thresholds/durations configured to organization policy.
# Objective Descriptions:
# - 3.1.8[a] - the means of limiting unsuccessful logon attempts is defined
# - 3.1.8[b] - the defined means of limiting unsuccessful logon attempts is implemented
# How: Unsuccessful login lockout is enforced. Windows via 'net accounts' & security policy; Linux via pam_faillock; macOS via 'pwpolicy' global policy.
# Criteria: results.txt must show lockout thresholds/durations configured to organization policy.
"net accounts" | Out-File results.txt
secedit /export /cfg sepolicy.inf | Out-Null
Select-String -Path sepolicy.inf -Pattern "LockoutBadCount|ResetLockoutCount|LockoutDuration" | Out-File -Append results.txt
$cfg = Get-Content sepolicy.inf
$bad = $cfg -match "LockoutBadCount\s*=\s*(\d+)"
$passed = $bad -and ($matches[1] -as [int]) -ge 3
$explain = "Unsuccessful login lockout is enforced. Windows via 'net accounts' & security policy; Linux via pam_faillock; macOS via 'pwpolicy' global policy. " + ($(if ($passed) {"Test passed—account lockout thresholds present."} else {"Objective not met—no lockout thresholds found."}))
$stat = @{Status=($(if($passed){"success"}else{"error"})); Explanation=$explain}
$stat | ConvertTo-Json | Out-File status.json
Executing this script as a Windows PowerShell script produces the following output:
(from file results.txt)
net accounts
sepolicy.inf:9:LockoutBadCount = 10
sepolicy.inf:10:ResetLockoutCount = 10
sepolicy.inf:11:LockoutDuration = 10
(from file status.json)
{
"Explanation": "Unsuccessful login lockout is enforced. Windows via \u0027net accounts\u0027 \u0026 security policy; Linux via pam_faillock; macOS via \u0027pwpolicy\u0027 global policy. Test passed-account lockout thresholds present.",
"Status": "success"
}
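Note that the generated script declares success whenever any lockout threshold of 3 or more is present, even though the organization's documented value in this scenario is 5. A stricter evaluation of the same exported policy can be sketched in bash (the `sepolicy.inf` contents below are assumed sample data mirroring the output above):

```shell
#!/usr/bin/env bash
# Sketch: evaluate an exported security policy against the documented
# lockout value (assumed 5). Sample sepolicy.inf content is assumed.
set -euo pipefail

REQUIRED=5

cat > sepolicy.inf <<'EOF'
LockoutBadCount = 10
ResetLockoutCount = 10
LockoutDuration = 10
EOF

# Extract the configured lockout threshold.
bad=$(awk -F'= ' '/LockoutBadCount/ {print $2}' sepolicy.inf)

# Compare against the documented value rather than a generic floor.
if [ -n "$bad" ] && [ "$bad" -gt 0 ] && [ "$bad" -le "$REQUIRED" ]; then
  status="success"; note="lockout threshold ($bad) meets documented value ($REQUIRED)"
else
  status="error"; note="lockout threshold ($bad) does not meet documented value ($REQUIRED)"
fi
printf '{"Status":"%s","Explanation":"%s"}\n' "$status" "$note" > status.json
cat status.json
```

With the sample threshold of 10 this sketch reports an error, consistent with the earlier observation that a value of 10 fails the organization's standard. Comparing against the documented (or ODP-assigned) value is exactly where parameterized ODPs pay off.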
Observations
The process of using AI to generate compliance test cases is an iterative exercise in attaining the desired results. AI tools such as ChatGPT are effective at generation tasks generally; however, due to the more complex nature of cybersecurity compliance, a simple prompt such as “Please generate a humorous story about three bears and a little blond girl” is not enough to create the desired outcome. It takes several iterations of dialog with the AI engine to understand how it responds and what it needs in order to accomplish the task.
Context is critical to the process. A sufficient quantity of contextual information must be made available to the AI tool so that it can create the correct output. It is valuable to remember that the AI engine has no prior information about the problem to be solved or the environment in which it is to operate. Identifying the prerequisite information goes hand in hand with the iterative process: when you see the results of a given prompt, it becomes evident what additional or modified information is needed.
As previously stated, Organization-Defined Parameters (ODPs) provide key parameters that the organization sets for control objectives. The example here is ODP 01 of requirement A.03.01.08:
A.03.01.08.ODP[01]: the number of consecutive invalid logon attempts by a user allowed during a time period is defined.
ODPs were added to the NIST SP800-171 standard with Revision 3; the currently enforced standard is Revision 2. In this code generation exercise, the list of ODPs was added as input to ChatGPT. In the Revision 2 standard for this example, there is no mention of “during a time period”. ChatGPT picked up this Revision 3 detail and added it to the generated test case.
Conclusions
Effective results can be obtained by applying AI technology to the task of generating compliance test cases. Given the potentially large number of test cases required to produce the evidence needed for a thorough assessment, generating them with AI can substantially reduce the time, effort and cost of conducting cybersecurity compliance assessments.
Appendix A - Samples of Generated Test Cases
TestCasesGenerated
| Requirement | Description | Environment | Test Command(s) | Test Script Result |
| --- | --- | --- | --- | --- |
| ACL2-3.1.1 | | Windows 11 | Get-LocalUser \| Select Name,Enabled,LastLogon \| Sort Name \| Format-Table -AutoSize > results.txt net user guest >> results.txt 2>&1 (Get-ItemProperty 'HKLM:\SYSTEM\CurrentControlSet\Control\Terminal Server\WinStations\RDP-Tcp').UserAuthentication \| Out-File -Append results.txt | Name Enabled LastLogon ---- ------- --------- Administrator False DefaultAccount False Guest False Michael Davidson True 11/5/2025 7:57:58 AM WDAGUtilityAccount False 1 |
| ACL2-3.1.1 | | Cisco ASA | # Execute these commands on ASA and capture output: show running-config show logging show ssh show http show telnet show aaa-server show aaa authentication show vpn-sessiondb anyconnect more system:running-config \| inc privilege\|username\|aaa\|ssh\|telnet\|http | Unable to test |
| ACL2-3.1.5 | Employ the principle of least privilege, including for specific security functions and privileged accounts. | Linux | set -euo pipefail id > results.txt echo '{{"Status":"success","Explanation":"Administrative group membership is constrained; privilege escalation requires auth. Test enumerates admin groups/sudo NOPASSWD exceptions. Generic evidence captured."}}' > status.json | uid=1000(mdavidson) gid=1000(mdavidson) groups=1000(mdavidson),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),100(users),116(lpadmin) |
| ACL2-3.1.5 | | Windows 11 | REM ACL2-3.1.5-least_privilege-Windows.bat net localgroup administrators > results.txt whoami /groups >> results.txt | Alias name administrators Comment Administrators have complete and unrestricted access to the computer/domain Members Administrator Michael Davidson The command completed successfully. GROUP INFORMATION Group Name Type SID Attributes ============================================================= ================ ============ =============================================================== Everyone Well-known group S-1-1-0 Mandatory group, Enabled by default, Enabled group NT AUTHORITY\Local account and member of Administrators group Well-known group S-1-5-114 Mandatory group, Enabled by default, Enabled group BUILTIN\Administrators Alias S-1-5-32-544 Mandatory group, Enabled by default, Enabled group, Group owner BUILTIN\Users Alias S-1-5-32-545 Mandatory group, Enabled by default, Enabled group NT AUTHORITY\INTERACTIVE Well-known group S-1-5-4 Mandatory group, Enabled by default, Enabled group CONSOLE LOGON Well-known group S-1-2-1 Mandatory group, Enabled by default, Enabled group NT AUTHORITY\Authenticated Users Well-known group S-1-5-11 Mandatory group, Enabled by default, Enabled group NT AUTHORITY\This Organization Well-known group S-1-5-15 Mandatory group, Enabled by default, Enabled group NT AUTHORITY\Local account Well-known group S-1-5-113 Mandatory group, Enabled by default, Enabled group LOCAL Well-known group S-1-2-0 Mandatory group, Enabled by default, Enabled group NT AUTHORITY\NTLM Authentication Well-known group S-1-5-64-10 Mandatory group, Enabled by default, Enabled group Mandatory Label\High Mandatory Level Label S-1-16-12288 |
| ACL2-3.1.5 | | Cisco ASA | # OS: Cisco ASA # TestKey: least_privilege # HowThisTestMeetsObjective: Administrative group membership is constrained; privilege escalation requires auth. Test enumerates admin groups/sudo NOPASSWD exceptions. # SuccessCriteria: results.txt must show minimal admin members and absence of broad NOPASSWD grants. # Execute these commands on ASA and capture output: show running-config \| include username\|privilege show privilege | utt |
| ACL2-3.1.7 | Non-privileged account use | Windows 11 | reg query "HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System" /v EnableLUA > results.txt reg query "HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System" /v ConsentPromptBehaviorAdmin >> results.txt | HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System EnableLUA REG_DWORD 0x0 HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System ConsentPromptBehaviorAdmin REG_DWORD 0x5 |
| ACL2-3.1.7 | Non-privileged account use | Linux | #!/usr/bin/env bash # How: Audit logging is configured to capture privileged function execution (e.g., sudo/privileged exec). The test checks platform audit settings (Windows: Advanced Audit Policy Privilege Use; Linux: auditd execve rules for euid=0; macOS: auditd policy argv/arge). # Criteria: results.txt must contain audit settings capturing privileged-use/exec and rule lines or policy indicating capture. set -euo pipefail : > results.txt echo "== audit rules ==" >> results.txt auditctl -l >> results.txt 2>/dev/null \|\| true grep -R "execve" /etc/audit/rules.d 2>/dev/null >> results.txt \|\| true echo "== sudo logging ==" >> results.txt grep -R "logfile" /etc/sudoers /etc/sudoers.d 2>/dev/null >> results.txt \|\| true blob=$(tr '[:upper:]' '[:lower:]' < results.txt) if echo "$blob" \| grep -E -q 'execve\|always,exit.*euid=0\|sudo.*logfile'; then echo "PASS: privileged exec is audited or sudo is logging" >> results.txt echo '{{"Status":"success","Explanation":"Audit logging is configured to capture privileged function execution (e.g., sudo/privileged exec). The test checks platform audit settings (Windows: Advanced Audit Policy Privilege Use; Linux: auditd execve rules for euid=0; macOS: auditd policy argv/arge). Test passed—auditd execve (root) and/or sudo logging present."}}' > status.json else echo "FAIL: no privileged exec audit or sudo logging found" >> results.txt echo '{{"Status":"error","Explanation":"Objective not met—no audit rules or sudo logging for privileged exec."}}' > status.json fi | == audit rules == == sudo logging == FAIL: no privileged exec audit or sudo logging found |
| ACL2-3.1.7 | Non-privileged account use | Cisco ASA | # Execute these commands on ASA and capture output: show logging show aaa accounting more system:running-config \| include logging | |
| ACL2-3.1.21 | | Windows 11 | Get-LocalUser \| Select Name,Enabled,LastLogon \| Sort Name \| Out-File results.txt "net accounts" \| Out-File -Append results.txt try { Get-ADUser -Filter 'enabled -eq $false' -Properties whenChanged,LastLogonDate \| Select SamAccountName,Enabled,whenChanged,LastLogonDate \| Out-File -Append results.txt } catch { "AD module not available" \| Out-File -Append results.txt } | Name Enabled LastLogon ---- ------- --------- Administrator False DefaultAccount False Guest False Michael Davidson True 11/5/2025 7:57:58 AM WDAGUtilityAccount False net accounts |
| ACL2-3.1.21 | | Cisco ASA | # OS: Cisco ASA # TestKey: account_management # HowThisTestMeetsObjective: Accounts are managed with appropriate lifecycles. Test lists disabled/stale accounts and password aging settings. # SuccessCriteria: results.txt must contain disabled accounts, lastlog insights, and aging configuration. # Execute these commands on ASA and capture output: show running-config show logging show ssh show http show telnet show aaa-server show aaa authentication show vpn-sessiondb anyconnect more system:running-config \| inc privilege\|username\|aaa\|ssh\|telnet\|http | Unable to test |
