The size and value of data are increasing day by day, and knowing where data lives and who accesses it has become one of the most important jobs in data security. Data discovery is usually done with data loss prevention (DLP) products; let's examine how we can do it in our AWS S3 buckets.
Amazon Macie discovers sensitive data using machine learning and pattern matching, provides visibility into data security risks, and enables automated protection against those risks. Macie works with EventBridge, so you can take action on its findings (for example, with Lambda or SNS).
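As a rough illustration of the EventBridge route, here is a minimal sketch of a Lambda handler that summarizes a Macie finding. The event pattern uses the source and detail-type that Macie publishes findings under. The function names and the choice of fields to extract are assumptions made for this example, not something taken from the setup above.

```python
import json

# EventBridge rule pattern that matches findings published by Macie.
MACIE_EVENT_PATTERN = {
    "source": ["aws.macie"],
    "detail-type": ["Macie Finding"],
}


def summarize_finding(event):
    """Pull the fields needed for an alert out of an EventBridge event.

    Missing fields come back as None instead of raising, because not every
    finding type carries an S3 object.
    """
    detail = event.get("detail", {})
    resources = detail.get("resourcesAffected", {})
    return {
        "type": detail.get("type"),
        "severity": detail.get("severity", {}).get("description"),
        "bucket": resources.get("s3Bucket", {}).get("name"),
        "key": resources.get("s3Object", {}).get("key"),
    }


def lambda_handler(event, context):
    """Lambda entry point (hypothetical): log a one-line summary."""
    summary = summarize_finding(event)
    print(json.dumps(summary))
    return summary
```

From here the summary could be forwarded to an SNS topic or used to tag or quarantine the object; that part depends on your own response process.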
What can we discover?
PII
Credit Card
AWS Account
Bank Account
IBAN, Tax ID ...
Whatever you need (with a regular expression)
Scenario
There is a bucket called "s3macie", and we will upload a text file containing a Turkish Identity Number to it. Then we enable Macie, create a job, and wait for the scan to find the number.
We create a bucket named s3macie and enable Macie.
We create a job (Discover data)
Choose S3 Buckets
Step: Refine the Scope (schedule the job; include or exclude objects)
Step: Select Managed Data Identifiers (select built-in patterns, or go to the custom regular expression part)
We want to discover Turkish Identity Numbers, so I go with "custom".
List of custom patterns. We create a new one here.
Create new
I write the regex and test it against a sample Turkish Identity Number (right side)
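Before pasting a pattern into the console, it can help to try it locally. The sketch below is only an assumption about what such a pattern might look like: a Turkish Identity Number has 11 digits and does not start with 0, so the regex matches exactly that on word boundaries. The sample text is made up for the test, and Macie supports only a subset of PCRE, so the pattern should still be checked in the console's test box.

```python
import re

# Assumed pattern: 11 digits, first digit not zero, bounded on both sides.
TCKN_REGEX = r"\b[1-9][0-9]{10}\b"


def find_candidates(text):
    """Return every substring of text that matches the pattern."""
    return re.findall(TCKN_REGEX, text)


# Made-up sample: one plausible identity number, one phone number that
# starts with 0, and one 12-digit order number.
sample = "Customer TCKN: 10000000146, phone: 05551234567, order 123456789012"
print(find_candidates(sample))  # ['10000000146']
```

A regex like this matches any 11-digit number that does not start with 0, so expect some false positives; adding keywords to the custom data identifier is one way to narrow the matches.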
Back to "Select custom data identifiers"
Step: Enter General Settings, fill in the values, and click "Next"
Finally, we create the job
We upload the file (it includes a test Turkish Identity Number)
Finally, we find the critical data in the S3 bucket with Macie
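The same job can also be created through the API instead of the console. The following is a minimal sketch that assumes boto3 and AWS credentials are available. The job name, the account ID, and the custom data identifier ID are hypothetical placeholders, and the request is built in a separate function so it can be inspected without calling AWS.

```python
import uuid


def build_job_request(account_id, bucket, custom_identifier_id,
                      name="s3macie-tckn-scan"):
    """Build keyword arguments for the macie2 create_classification_job call."""
    return {
        "clientToken": str(uuid.uuid4()),  # idempotency token
        "jobType": "ONE_TIME",             # "SCHEDULED" also needs scheduleFrequency
        "name": name,                      # hypothetical job name
        "customDataIdentifierIds": [custom_identifier_id],
        "s3JobDefinition": {
            "bucketDefinitions": [
                {"accountId": account_id, "buckets": [bucket]}
            ]
        },
    }


def create_job(request):
    """Submit the job. Needs boto3 and AWS credentials, so it is not called here."""
    import boto3  # imported lazily so the builder works without boto3 installed

    client = boto3.client("macie2")
    return client.create_classification_job(**request)["jobId"]


# Placeholder values for illustration only.
request = build_job_request("123456789012", "s3macie", "example-identifier-id")
print(request["s3JobDefinition"])
```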
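The findings shown in the console can also be pulled programmatically. This is a small sketch, again assuming boto3 and credentials. It filters findings by bucket name and then fetches their details; the helper names are made up for this example, and pagination is left out for brevity.

```python
def build_finding_criteria(bucket):
    """Filter that keeps only findings for objects in the given bucket."""
    return {
        "criterion": {
            "resourcesAffected.s3Bucket.name": {"eq": [bucket]},
        }
    }


def fetch_findings(bucket):
    """Return finding details for the bucket. Calls AWS, so it is not run here."""
    import boto3  # imported lazily so the filter builder works without boto3

    client = boto3.client("macie2")
    ids = client.list_findings(
        findingCriteria=build_finding_criteria(bucket)
    )["findingIds"]
    if not ids:
        return []
    # Only the first page of IDs is fetched in this sketch.
    return client.get_findings(findingIds=ids)["findings"]


print(build_finding_criteria("s3macie"))
```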