Advancing Security Data Management with Amazon Security Lake

October 11, 2023
Christian Almenar
Darwin Salazar

Today, security teams face the Herculean task of managing a myriad of security solutions, each with its own data output schema, while also dealing with a constant stream of alerts and complex regulatory challenges. Rising to meet this challenge, the integration between Monad and Amazon Security Lake, now available in AWS Marketplace, offers a robust solution. Amazon Security Lake automatically centralizes security data from AWS environments, SaaS providers, on premises, and cloud sources into a purpose-built data lake stored in your AWS account. Monad streamlines the complex Extract, Load, Transform (ELT) processes essential in today’s multi-cloud environment, sending security findings to Amazon Security Lake. This integration empowers security, DevOps and governance, risk, and compliance (GRC) teams to focus on leveraging their security data effectively, staying ahead of sophisticated adversaries and fostering a more collaborative security culture.

Expanding OCSF Compatibility with Monad's Data Transformation

Security teams can now use Monad to convert data from leading security solutions (for example, Tenable VM, CrowdStrike Falcon Spotlight, Snyk, Semgrep, and Qualys Web Application Scanning, among many others) into the Open Cybersecurity Schema Framework (OCSF) and deliver it directly to Amazon Security Lake for retention and analysis. This helps teams streamline their processes and manage their security data more efficiently. For more on the importance of OCSF, check out the article "OCSF and the Future of Security Data Modeling" by Jacolon Walker, co-founder and CTO of Monad.

Enhancing Amazon Security Lake with Monad

As security teams aim to build a more robust security data lake, Monad emerges as a key ally. Monad brings more transformed data into the lake, creating a rich and diverse data repository. It does this by running ELT on third-party security solution data so that the data can be analyzed alongside AWS-native logs such as AWS CloudTrail management and data events, Amazon Virtual Private Cloud (Amazon VPC) Flow Logs, and AWS Security Hub findings. By focusing solely on security findings, Monad ensures that teams have access to the most pertinent and critical data, enhancing their ability to safeguard their environments.

A Practical Use Case: Leveraging Amazon Security Lake and Monad to correlate Amazon VPC Flow Logs with Tenable VM cloud scan data to understand vulnerability exposure

To demonstrate the value of using Monad and Amazon Security Lake together, we’re excited to showcase the use case below.

The Problem: When performing security activities like threat hunting, it’s critical to understand what potential malicious activity may be taking place on your networks. However, with the enormous amount of activity that network traffic can generate, it can be difficult to identify what might be attacker activity versus what is normal network usage.

The Approach: To help a security analyst sift through the noise and identify Internet Protocol (IP) addresses worth investigating, a user can deploy Monad to ingest Tenable VM findings, convert them into OCSF format, and load the transformed data into Amazon Security Lake. Once Monad has transformed and moved the Tenable data, the user can leverage Amazon Athena to query the data stored in Amazon Security Lake and join it with Amazon VPC Flow Logs, picking out the IP addresses that send the most traffic to our vulnerable instances. This narrows down potential attackers to a set of IP addresses known to be attempting to connect to vulnerable instances. These results can be narrowed further by filtering out any blocked traffic and focusing only on connections that were allowed, and therefore possibly successful.
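For teams that want to run the Athena step programmatically rather than from the console, the following is a minimal sketch, not part of the original walkthrough. The function name `run_athena_query` and its parameters are our own; the three client methods it calls (`start_query_execution`, `get_query_execution`, `get_query_results`) are the standard Athena API operations exposed by the boto3 Athena client. It assumes the caller supplies a client, a Glue database name, and an S3 output location.

```python
import time


def run_athena_query(athena, query, database, output_location, poll_seconds=1.0):
    """Submit a query to Athena, wait for completion, and return rows as lists of strings.

    `athena` is expected to behave like boto3.client("athena"). This sketch reads
    only the first page of results, and for SELECT queries the first returned row
    is the column header.
    """
    started = athena.start_query_execution(
        QueryString=query,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]

    # Poll until Athena reports a terminal state.
    while True:
        status = athena.get_query_execution(QueryExecutionId=query_id)["QueryExecution"]["Status"]
        if status["State"] in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_seconds)

    if status["State"] != "SUCCEEDED":
        raise RuntimeError(f"Athena query {query_id} ended in state {status['State']}")

    result = athena.get_query_results(QueryExecutionId=query_id)
    return [
        [col.get("VarCharValue") for col in row["Data"]]
        for row in result["ResultSet"]["Rows"]
    ]
```

A caller would typically create the client with `boto3.client("athena")` and pass the Security Lake Glue database name and an S3 location they own for query results. Passing the client in as an argument keeps the function easy to exercise with a stub.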

The Solution: Leveraging the approach above, let’s dive into the solution depicted below in a Monad-Amazon Security Lake user journey. 

Commentary

With our first CTE (Common Table Expression), we can pick out a particular instance ID from our Amazon VPC Flow Logs and gather some common information.  We alias some columns to more friendly names to make queries easier to read.

with
target_instance as (
  -- Flow-log records destined for one instance, with friendlier column names
  SELECT
    src_endpoint.ip as src_ip,
    dst_endpoint.ip as dst_ip,
    dst_endpoint.instance_uid as instance_id,
    traffic.packets as packets,
    traffic.bytes as bytes,
    disposition as result
  FROM "amazon_security_lake_glue_db_us_west_2"."amazon_security_lake_table_us_west_2_vpc_flow_1_0"
  WHERE dst_endpoint.instance_uid = 'i-035ca199d6c21a7ee'
),

Commentary

Now we create a new CTE to collect basic information about vulnerable instances scanned by Tenable. We also filter for findings with a severity_id >= 4 (High or above in OCSF) to narrow our search down to the highest-priority instances.

NOTE: Monad currently supports only security findings as a custom source in Amazon Security Lake. Any other data type requires an additional custom source to be set up.

tenable_ocsf as (
  -- Tenable findings delivered by Monad, limited to severity_id 4 and above
  select
    observables[1].name as instance_id,
    severity_id,
    message
  FROM "amazon_security_lake_glue_db_us_west_2"."amazon_security_lake_table_us_west_2_ext_monad"
  where severity_id >= 4
),
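One detail of this filter is worth checking against your own data. The sketch below lists the OCSF severity_id values as we understand the enumeration (verify against the schema version your lake uses) and shows which captions a `>= 4` threshold matches. The helper names are ours, for illustration only.

```python
# OCSF severity_id values and captions (assumed from the OCSF schema;
# confirm against the schema version in use).
OCSF_SEVERITY = {
    0: "Unknown",
    1: "Informational",
    2: "Low",
    3: "Medium",
    4: "High",
    5: "Critical",
    6: "Fatal",
    99: "Other",
}


def matched_by_threshold(threshold=4):
    """Return the captions that a `severity_id >= threshold` filter would match."""
    return [caption for sid, caption in sorted(OCSF_SEVERITY.items()) if sid >= threshold]


def is_high_priority(severity_id):
    """Stricter variant: High, Critical, or Fatal only, excluding 99 ("Other")."""
    return 4 <= severity_id <= 6


print(matched_by_threshold())  # ['High', 'Critical', 'Fatal', 'Other']
```

Because 99 ("Other") is numerically greater than 4, a plain `>= 4` comparison also matches it. If a source emits that value, a bounded condition such as `severity_id between 4 and 6` may be closer to the intent.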

Commentary

Using the first CTE we created, we create a friendly named CTE to sum up the total traffic received by our instances and give the instance_id a name of nw_instance_id to be able to distinguish it in a later join.

network_traffic as (
  -- Total bytes per source/destination pair and disposition
  select
    src_ip,
    dst_ip,
    instance_id as nw_instance_id,
    result,
    sum(bytes) as total_bytes
  from target_instance
  group by src_ip, dst_ip, instance_id, result
  order by total_bytes desc
),

Commentary 

Using our previous CTEs of tenable_ocsf and network_traffic, we join our data together to create another CTE in which we match up our original instance id with our network_traffic data.

attacker_ips as
(select * from tenable_ocsf
join network_traffic on tenable_ocsf.instance_id = network_traffic.nw_instance_id
order by network_traffic.total_bytes desc)

Commentary 

Finally, we filter out duplicates from the attacker_ips CTE, keep only results where the traffic was allowed, and order the results by which source IPs sent the most traffic to our instance. With all that done, we are now ready to dive into IP analysis and ruling out any false positives!

select distinct
  src_ip,
  total_bytes,
  instance_id,
  result
from attacker_ips
where result = 'Allowed'
order by total_bytes desc

Running our query, we can find a good number of potential IPs we need to investigate!
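To experiment with the shape of this query without an AWS account, here is a minimal local sketch using Python's built-in sqlite3 module. Everything in it is an assumption for illustration: the tables `vpc_flow` and `tenable_findings` are hypothetical flattened stand-ins for the Security Lake tables (SQLite has no nested OCSF fields), and the rows are made-up sample data using documentation IP ranges.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table vpc_flow (src_ip text, dst_ip text, instance_id text, bytes integer, result text);
create table tenable_findings (instance_id text, severity_id integer, message text);
insert into vpc_flow values
  ('203.0.113.10', '10.0.0.5', 'i-aaa', 500, 'Allowed'),
  ('203.0.113.10', '10.0.0.5', 'i-aaa', 700, 'Allowed'),
  ('198.51.100.7', '10.0.0.5', 'i-aaa', 900, 'Blocked'),
  ('192.0.2.99',   '10.0.0.5', 'i-aaa', 300, 'Allowed');
insert into tenable_findings values
  ('i-aaa', 4, 'finding A'),
  ('i-aaa', 5, 'finding B'),
  ('i-aaa', 2, 'finding C');
""")

# Same CTE chain as the walkthrough, against the flattened toy tables.
QUERY = """
with
target_instance as (
  select src_ip, dst_ip, instance_id, bytes, result
  from vpc_flow
  where instance_id = 'i-aaa'
),
tenable_ocsf as (
  select instance_id, severity_id, message
  from tenable_findings
  where severity_id >= 4
),
network_traffic as (
  select src_ip, dst_ip, instance_id as nw_instance_id, result, sum(bytes) as total_bytes
  from target_instance
  group by src_ip, dst_ip, instance_id, result
),
attacker_ips as (
  select * from tenable_ocsf
  join network_traffic on tenable_ocsf.instance_id = network_traffic.nw_instance_id
)
select distinct src_ip, total_bytes, instance_id, result
from attacker_ips
where result = 'Allowed'
order by total_bytes desc
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

In this toy data the instance has two qualifying findings, so the join yields each traffic row twice. The `distinct` in the final select, which applies to the whole row rather than to src_ip alone, is what collapses those repeats into one line per source IP.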

Impact: In this use case, we demonstrated how leveraging Monad and Amazon Security Lake together can enable security analysts to understand vulnerability exposure better and more quickly.

Where Do We Go From Here? 

Monad’s integration with Amazon Security Lake frees security teams from the hassle of wrestling with log parsing, maintaining API integrations, and other complex ELT issues. We are excited for Amazon Security Lake users to explore the benefits that Monad offers. Get started by trying Monad Basic, which gives teams the full functionality of Monad for free for the first million rows of security data or the first month, whichever comes first. And be on the lookout for a follow-up webinar to this post. Please reach out to us at hello@monad.com if you have any questions on this use case! Let’s build a safer, more secure future together.