What is PII?
Personally Identifiable Information (PII) is any data that could be used to identify a specific person.
What is included in PII?
What counts as PII varies by country, but it usually includes the following:
- Mobile number
- Phone number
- Physical Address
- Email address
- Aadhaar number
- PAN number
- Salary amount
- Social security number
- National ID number
- Session cookies
How to find PII data in Splunk logs?
To find PII, start with a basic query such as index=test "*@gmail.com". The output will contain Gmail addresses. From that output, note the variable names the company uses for PII fields, such as "emailAddress" or "Phonenumber". Then search in the same way for the other variable names that the logs use for PII.
While searching for PII, we found some variable names that companies commonly use for PII fields; they are listed later in this article. You can use those variable names to craft queries for finding PII.
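Reading variable names out of the search output can also be done offline on exported log lines. The following is a minimal Python sketch under stated assumptions: the log lines are made up, the key=value and JSON layouts are guesses about the log format, and the helper name variable_names_holding_emails is hypothetical. It counts the key names whose value looks like an email address.

```python
import re
from collections import Counter

# Matches pairs such as emailAddress=a@b.com or "emailAddress": "a@b.com".
# The log layout is an assumption; adjust the pattern to the real log format.
KEY_WITH_EMAIL = re.compile(
    r'"?([A-Za-z_][\w.]*)"?\s*[=:]\s*"?[\w.+-]+@[\w-]+(?:\.[\w-]+)+'
)


def variable_names_holding_emails(log_lines):
    """Count the key names whose value looks like an email address."""
    counts = Counter()
    for line in log_lines:
        for key in KEY_WITH_EMAIL.findall(line):
            counts[key] += 1
    return counts


# Made-up log lines for illustration only.
sample_logs = [
    'level=INFO a=ServiceName emailAddress=jane.doe@gmail.com status=ok',
    '{"logger": "com.example.Orders", "customerEmail": "john@gmail.com"}',
    'level=INFO a=ServiceName emailAddress=sam@gmail.com status=ok',
]

names = variable_names_holding_emails(sample_logs)
print(names.most_common())
```

The key names it reports (here emailAddress and customerEmail) are candidates for the next round of Splunk searches.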
How to find all service names and loggers associated with the service names?
Search for several email domains at once, as shown below, and group the results by service name and logger. In these logs the field a holds the service name:
index=test "@gmail.com" OR "@hotmail.com" OR "@outlook.com" | stats count by a, logger
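When the list of email domains grows, the query string can be generated instead of typed by hand. This is a small Python sketch, not part of Splunk itself: the function name build_domain_query and its parameters are hypothetical, and the field names a and logger are taken from the query above. The parentheses it adds only make the OR grouping explicit.

```python
def build_domain_query(index, domains, group_by=("a", "logger")):
    """Build a Splunk search string that matches any of the given email domains."""
    if not domains:
        raise ValueError("at least one domain is required")
    # Quote each domain as a literal search term, e.g. "@gmail.com".
    terms = " OR ".join(f'"@{d.lstrip("@")}"' for d in domains)
    # Parentheses keep the OR terms grouped next to the index filter.
    return f'index={index} ({terms}) | stats count by {", ".join(group_by)}'


query = build_domain_query("test", ["gmail.com", "hotmail.com", "outlook.com"])
print(query)
```

The printed string can be pasted into the Splunk search bar.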
Commonly used variable names:
- email address
- Order amount
Note: Not every variable name is itself PII, but each one helps you locate PII in the logs.
Examples of Splunk PII data dorks:
- index=test "*@gmail.com"
- index=test "*@gmail.com" AND a="ServiceName"
- index=test "*@*.com"
- index=test "billingAddress"
- index=test "billingAddress" AND a="ServiceName"
- index=test "billingAddress" AND a="ServiceName" AND a!="Servicename1" (a!="Servicename1" excludes the service Servicename1 from the results)
- index=test "mobilePhone"
- index=test "@gmail.com" OR "@hotmail.com" OR "@outlook.com" OR "@hotmail.co.uk" OR "@yahoo.com"
- Whenever you search for PII, identify the service name first and then filter the PII search by that service name. This narrows the results to a particular service. Example:
Step 1: Search for index=test "Address" in Splunk.
Step 2: In the list of fields on the left-hand side, find "a" and click it.
Step 3: All the service names are now visible.
Step 4: Search for PII only within that service name: index=test "Address" AND a="ServiceName"
- PII should not appear in the URL. Craft the query so that the URL used in the report does not contain any PII, for example by searching for a variable name rather than an actual value.
- Refine the query by using unique values or variable names, for example: index=test "emailAddress" "bankAccount" AND a="Servicename" AND logger="com.xxx.store.yyy.vvv"
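The last two tips can be combined in a short Python sketch: build the refined query from variable names, a service, and a logger, then check that the query text contains no literal PII before it goes into a report URL. Everything here is an assumption for illustration: the function names refine_query and has_pii_literal are hypothetical, and the two regular expressions are rough heuristics, not a complete PII detector.

```python
import re

# Rough patterns for literal PII values; heuristics only.
EMAIL_VALUE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
LONG_DIGITS = re.compile(r"\d{8,}")  # e.g. phone or ID numbers


def refine_query(index, terms, service=None, logger=None, exclude_services=()):
    """Combine quoted search terms with optional service and logger filters."""
    parts = [f"index={index}"]
    parts += [f'"{t}"' for t in terms]
    if service:
        parts.append(f'a="{service}"')
    parts += [f'a!="{s}"' for s in exclude_services]
    if logger:
        parts.append(f'logger="{logger}"')
    # Splunk treats space-separated search terms as an implicit AND.
    return " ".join(parts)


def has_pii_literal(query):
    """Return True if the query text itself looks like it contains a PII value."""
    return bool(EMAIL_VALUE.search(query) or LONG_DIGITS.search(query))


safe = refine_query(
    "test",
    ["emailAddress", "bankAccount"],
    service="Servicename",
    logger="com.xxx.store.yyy.vvv",
)
print(safe)
print(has_pii_literal(safe))
print(has_pii_literal('index=test "jane.doe@gmail.com"'))
```

A query built only from variable names passes the check, while one that embeds a full email address is flagged and should be rewritten before its URL is shared.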