Commit 3331b27

Merge pull request #13 from PayU/fix-edge-case-json-parse
feat: add special escaped cases
2 parents 989c44d + 48ae41d commit 3331b27

4 files changed, with 64 additions and 52 deletions

README.md

Lines changed: 18 additions & 12 deletions
Original file line number | Diff line number | Diff line change
@@ -2,32 +2,36 @@
22

33
[![Known Vulnerabilities](https://snyk.io//test/github/PayU/fluent-plugin-masking/badge.svg?targetFile=Gemfile.lock)](https://snyk.io//test/github/PayU/fluent-plugin-masking?targetFile=Gemfile.lock) [![Build Status](https://travis-ci.com/PayU/fluent-plugin-masking.svg?branch=master)](https://travis-ci.com/PayU/fluent-plugin-masking)
44

5-
## Overview
5+
# Overview
66
Fluentd filter plugin that masks sensitive or private record fields with `*******` in place of the original value. This data masking plugin protects data such as names, emails, phone numbers, addresses, and any other field you would like to mask.
77

8-
## Requirements
8+
# Requirements
99
| fluent-plugin-masking | fluentd | ruby |
1010
| --------------------- | ---------- | ------ |
1111
| 1.2.x | >= v0.14.0 | >= 2.5 |
1212

1313

14-
## Installation
14+
# Installation
1515
Install with gem:
1616

17-
`gem install fluent-plugin-masking`
17+
`fluent-gem install fluent-plugin-masking`
1818

19-
## Setup
19+
# Setup
2020
To set up this plugin, the parameter `fieldsToMaskFilePath` must be a valid path to a file listing all the fields to mask, with one unique field per line. These fields **are** case-sensitive (`Name` != `name`).
2121

22-
In addition, there's an optional parameter called `fieldsToExcludeJSONPaths` which receives as input a comma separated string of JSON fields that should be excluded in the masking procedure. Nested JSON fields are supported by `dot notation` (i.e: `path.to.excluded.field.in.record.nestedExcludedField`)
23-
The JSON fields that are excluded are comma separated.
24-
This can be used for logs of registration services or audit log entries which do not need to be masked.
25-
This is configured as shown below:
22+
### Optional configuration
23+
- `fieldsToExcludeJSONPaths` - a comma-separated string of JSON fields to exclude from masking. Nested JSON fields are supported using dot notation (e.g. `path.to.excluded.field.in.record.nestedExcludedField`).
24+
This can be used for logs of registration services or audit log entries which do not need to be masked.
25+
26+
- `handleSpecialEscapedJsonCases` - a boolean that enables an attempt to fix special escaped JSON cases. This feature is currently in alpha (default: false). For more details about those special cases, see [Special Json Cases](#Special-escaped-json-cases-handling).
27+
28+
An example with optional configuration parameters:
2629
```
2730
<filter "**">
2831
@type masking
2932
fieldsToMaskFilePath "/path/to/fields-to-mask-file"
3033
fieldsToExcludeJSONPaths "excludedField,exclude.path.nestedExcludedField"
34+
handleSpecialEscapedJsonCases true
3135
</filter>
3236
```
3337

@@ -38,7 +42,7 @@ email
3842
phone/i # the '/i' suffix will make sure phone field will be case insensitive
3943
```
4044

41-
## Quick Guide
45+
# Quick Guide
4246

4347
### Configuration:
4448
```
@@ -98,10 +102,12 @@ echo '{ :body => "{\"first_name\":\"mickey\", \"type\":\"puggle\", \"last_name\"
98102
2019-12-01 14:25:53.385681000 +0300 maskme: {"message":"{ :body => \"{\\\"first_name\\\":\\\"mickey\\\", \\\"type\\\":\\\"puggle\\\", \\\"last_name\\\":\\\"the-dog\\\", \\\"password\\\":\\\"*******\\\"}\"}"}
99103
```
100104

101-
102-
### Run Unit Tests
105+
# Run Unit Tests
103106
```
104107
gem install bundler
105108
bundle install
106109
ruby -r ./test/*.rb
107110
```
111+
112+
# Special escaped json cases handling
113+
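The README heading above is added without body text in this commit. As a rough illustration only, the sketch below applies the pattern that the commit registers in `@specialEscapedJsonRegexs` to a few strings, including the cookie value from the new test. The helper name `collapse_special_commas` is hypothetical, and the sketch writes the replacement as single-quoted `'\1,'` so that `\1` is a `gsub` backreference (in a double-quoted Ruby literal, `"\1"` is the control character U+0001).

```ruby
# Pattern copied from the commit: a comma, then optional spaces, backslashes,
# quotes and spaces (captured together as group 1), then a second comma.
SPECIAL_CASE_REGEX = /,(( *)(\\*)("*)( *)),/m

# Hypothetical helper (not part of the plugin). Each match is replaced by
# group 1 followed by a single comma, which drops the first of the two commas.
def collapse_special_commas(str)
  str.gsub(SPECIAL_CASE_REGEX, '\1,')
end

puts collapse_special_commas('a,,b')
puts collapse_special_commas('x, ,y')
puts collapse_special_commas('some_custom=,,live,default,,2097403972,2.22.242.38,')
```

Because `gsub` scans for non-overlapping matches, a run of three commas is reduced to two in a single pass rather than to one.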

lib/fluent/plugin/filter_masking.rb

Lines changed: 22 additions & 6 deletions
Original file line number | Diff line number | Diff line change
@@ -30,14 +30,20 @@ def maskRecord(record)
3030
end
3131
begin
3232
recordStr = record.to_s
33+
34+
if @handleSpecialEscapedJsonCases == true
35+
@specialEscapedJsonRegexs.each do | regex, replace |
36+
recordStr = recordStr.gsub(regex, replace)
37+
end
38+
end
39+
3340
@fields_to_mask_regex.each do | fieldToMaskRegex, fieldToMaskRegexStringReplacement |
3441
if !(excludedFields.include? @fields_to_mask_keys[fieldToMaskRegex])
3542
recordStr = recordStr.gsub(fieldToMaskRegex, fieldToMaskRegexStringReplacement)
3643
end
3744
end
38-
45+
3946
maskedRecord = strToHash(recordStr)
40-
4147
rescue Exception => e
4248
$log.error "Failed to mask record: #{e}"
4349
end
@@ -51,13 +57,19 @@ def initialize
5157
@fields_to_mask_regex = {}
5258
@fields_to_mask_keys = {}
5359
@fieldsToExcludeJSONPathsArray = []
60+
61+
@handleSpecialEscapedJsonCases = false
62+
@specialEscapedJsonRegexs = {
63+
Regexp.new(/,(( *)(\\*)("*)( *)),/m) => '\1,' # single-quoted so \1 is a backreference ("\1" would be the character \x01)
64+
}
5465
end
5566

5667
# this method is only called once (at startup)
5768
def configure(conf)
5869
super
5970
fieldsToMaskFilePath = conf['fieldsToMaskFilePath']
6071
fieldsToExcludeJSONPaths = conf['fieldsToExcludeJSONPaths']
72+
handleSpecialCases = conf['handleSpecialEscapedJsonCases']
6173

6274
if fieldsToExcludeJSONPaths != nil && fieldsToExcludeJSONPaths.size() > 0
6375
fieldsToExcludeJSONPaths.split(",").each do | field |
@@ -80,12 +92,12 @@ def configure(conf)
8092
if value.end_with? "/i"
8193
# case insensitive
8294
value = value.delete_suffix('/i')
83-
hashObjectRegex = Regexp.new(/(?::#{value}=>")(.*?)(?:")/mi)
84-
innerJSONStringRegex = Regexp.new(/(\\+)"#{value}\\+"\s*:\s*(\\+|\{).+?((?=(})|,( *|)(\s|\\+)\"(}*))|(?=}"$)|("}(?!\"|\\)))/mi)
95+
hashObjectRegex = Regexp.new(/(?::#{value}=>")(.*?)(?:")/mi) # mask element in hash object
96+
innerJSONStringRegex = Regexp.new(/(\\+)"#{value}\\+":\\+.+?((?=(})|,( *|)(\s|\\+)\")|(?=}"$))/mi) # mask element in json string using capture groups that count the level of escaping inside the json string
8597
else
8698
# case sensitive
87-
hashObjectRegex = Regexp.new(/(?::#{value}=>")(.*?)(?:")/m)
88-
innerJSONStringRegex = Regexp.new(/(\\+)"#{value}\\+"\s*:\s*(\\+|\{).+?((?=(})|,( *|)(\s|\\+)\"(}*))|(?=}"$)|("}(?!\"|\\)))/m)
99+
hashObjectRegex = Regexp.new(/(?::#{value}=>")(.*?)(?:")/m) # mask element in hash object
100+
innerJSONStringRegex = Regexp.new(/(\\+)"#{value}\\+":\\+.+?((?=(})|,( *|)(\s|\\+)\")|(?=}"$))/m) # mask element in json string using capture groups that count the level of escaping inside the json string
89101
end
90102

91103
@fields_to_mask.push(value)
@@ -100,6 +112,10 @@ def configure(conf)
100112
end
101113
end
102114

115+
# if true, each record (a JSON record) is checked for special escaped JSON cases
116+
# any case found is replaced via gsub with the corrected form
117+
@handleSpecialEscapedJsonCases = handleSpecialCases != nil && handleSpecialCases.casecmp("true") == 0
118+
103119
puts "black list fields:"
104120
puts @fields_to_mask
105121
end
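
The flag handling added to `configure` can be read in isolation. The following is a minimal sketch, assuming the option arrives as a string or `nil`; the helper name `special_cases_enabled?` and the plain Hash standing in for the Fluentd configuration object are illustrative and not part of the plugin.

```ruby
# Hypothetical helper mirroring the expression added to `configure`:
# the option counts as enabled only when it is the string "true",
# compared case-insensitively. A missing option (nil) means disabled.
def special_cases_enabled?(raw_value)
  !raw_value.nil? && raw_value.casecmp("true") == 0
end

# A plain Hash is used here in place of the real Fluentd config object.
conf = { "handleSpecialEscapedJsonCases" => "True" }

puts special_cases_enabled?(conf["handleSpecialEscapedJsonCases"])
puts special_cases_enabled?(conf["missingOption"])
```

Any value other than a case-insensitive `true`, such as `yes` or `1`, leaves the feature off under this check.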

test/fields-to-mask

Lines changed: 1 addition & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -4,4 +4,4 @@ last_name
44
street
55
number
66
password
7-
json_str_field
7+
cookie
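
For orientation, here is a hedged sketch of how one entry from this file might become a regex for the stringified-hash form (`:field=>"value"`), following the shape of `hashObjectRegex` in `filter_masking.rb`. The helper names are hypothetical. `Regexp.escape` is an addition not present in the commit, and the capture groups are rearranged so the key can be written back, because the replacement string the plugin actually uses is not shown in this diff.

```ruby
MASK = "*******"

# Hypothetical helper: builds a regex for one fields-file entry.
# A trailing "/i" makes the field name match case-insensitively.
def mask_regex_for(entry)
  insensitive = entry.end_with?("/i")
  name = insensitive ? entry.delete_suffix("/i") : entry  # delete_suffix needs Ruby >= 2.5
  flags = Regexp::MULTILINE | (insensitive ? Regexp::IGNORECASE : 0)
  # Group 1: key and opening quote, group 2: value, group 3: closing quote.
  Regexp.new("(:#{Regexp.escape(name)}=>\")(.*?)(\")", flags)
end

# Hypothetical helper: replaces the value while keeping the key as written.
def mask_field(record_str, entry)
  record_str.gsub(mask_regex_for(entry), "\\1#{MASK}\\3")
end

sample = '{:Phone=>"555-1234", :city=>"new york"}'
puts mask_field(sample, "phone/i")  # the key matches regardless of case
puts mask_field(sample, "phone")    # case-sensitive entry leaves :Phone untouched
```

The sample string is written out by hand rather than produced with `Hash#to_s`, since the text form of a hash can differ between Ruby versions.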

test/test_filter_masking.rb

Lines changed: 23 additions & 33 deletions
Original file line number | Diff line number | Diff line change
@@ -4,7 +4,6 @@
44
require "fluent/test/helpers"
55
require "./lib/fluent/plugin/filter_masking.rb"
66

7-
87
MASK_STRING = "*******"
98

109
class YourOwnFilterTest < Test::Unit::TestCase
@@ -28,6 +27,13 @@ def setup
2827
fieldsToMaskFilePath test/fields-to-mask-insensitive
2928
]
3029

30+
# configuration for special json escaped cases
31+
CONFIG_SPECIAL_CASES = %[
32+
fieldsToMaskFilePath test/fields-to-mask
33+
fieldsToExcludeJSONPaths excludedField,exclude.path.nestedExcludedField
34+
handleSpecialEscapedJsonCases true
35+
]
36+
3137
def create_driver(conf = CONFIG)
3238
Fluent::Test::Driver::Filter.new(Fluent::Plugin::MaskingFilter).configure(conf)
3339
end
@@ -102,6 +108,7 @@ def filter(config, messages)
102108
filtered_records = filter(conf, messages)
103109
assert_equal(expected, filtered_records)
104110
end
111+
105112
test 'mask field in hash object with exclude' do
106113
conf = CONFIG
107114
messages = [
@@ -113,6 +120,7 @@ def filter(config, messages)
113120
filtered_records = filter(conf, messages)
114121
assert_equal(expected, filtered_records)
115122
end
123+
116124
test 'mask field in hash object with nested exclude' do
117125
conf = CONFIG
118126
messages = [
@@ -149,37 +157,6 @@ def filter(config, messages)
149157
assert_equal(expected, filtered_records)
150158
end
151159

152-
test 'mask field which is inner json string field (should mask the whole object)' do
153-
conf = CONFIG
154-
messages = [
155-
{
156-
:body => {
157-
:action_name => "some_action",
158-
:action_type => "some type",
159-
:request => {
160-
:body_str => "{\"str_field\":\"mickey\",\"json_str_field\": {\"id\":\"ed8a8378-3235-4923-b802-7700167d1870\"},\"not_mask\":\"some_value\"}"
161-
}
162-
},
163-
:timestamp => "2020-06-08T16:00:57.341Z"
164-
}
165-
]
166-
167-
expected = [
168-
{
169-
:body => {
170-
:action_name => "some_action",
171-
:action_type => "some type",
172-
:request => {
173-
:body_str => "{\"str_field\":\"mickey\",\"json_str_field\":\"*******\",\"not_mask\":\"some_value\"}"
174-
}
175-
},
176-
:timestamp => "2020-06-08T16:00:57.341Z"
177-
}
178-
]
179-
180-
filtered_records = filter(conf, messages)
181-
assert_equal(expected, filtered_records)
182-
end
183160
end
184161

185162
sub_test_case 'plugin will mask all fields that need masking - case INSENSITIVE fields' do
@@ -223,7 +200,7 @@ def filter(config, messages)
223200
test 'mask case insensitive and case sensitive field in nested json escaped string' do
224201
conf = CONFIG_CASE_INSENSITIVE
225202
messages = [
226-
{ :body => "{\"firsT_naMe\":\"mickey\",\"last_name\":\"the-dog\",\"address\":\"{\\\"Street\":\\\"Austin\\\",\\\"number\":\\\"89\\\"}\", \"type\":\"puggle\"}" }
203+
{ :body => "{\"firsT_naMe\":\"mickey\",\"last_NAME\":\"the-dog\",\"address\":\"{\\\"Street\":\\\"Austin\\\",\\\"number\":\\\"89\\\"}\", \"type\":\"puggle\"}" }
227204
]
228205
expected = [
229206
{ :body => "{\"firsT_naMe\":\"mickey\",\"last_name\":\"*******\",\"address\":\"{\\\"street\\\":\\\"*******\\\",\\\"number\\\":\\\"*******\\\"}\", \"type\":\"puggle\"}" }
@@ -234,4 +211,17 @@ def filter(config, messages)
234211

235212
end
236213

214+
sub_test_case 'plugin will mask all fields that need masking - special json escaped cases' do
215+
test 'mask field in nested json escaped string when one of the values ends with "," (the value for "some_custom" field)' do
216+
conf = CONFIG_SPECIAL_CASES
217+
messages = [
218+
{ :body => "{\"first_name\":\"mickey\",\"last_name\":\"the-dog\",\"address\":\"{\\\"street\":\\\"Austin\\\",\\\"number\":\\\"89\\\"}\", \"type\":\"puggle\", \"cookie\":\"some_custom=,,live,default,,2097403972,2.22.242.38,\", \"city\":\"new york\"}" }
219+
]
220+
expected = [
221+
{ :body => "{\"first_name\":\"*******\",\"last_name\":\"*******\",\"address\":\"{\\\"street\\\":\\\"*******\\\",\\\"number\\\":\\\"*******\\\"}\", \"type\":\"puggle\", \"cookie\":\"*******\", \"city\":\"new york\"}" }
222+
]
223+
filtered_records = filter(conf, messages)
224+
assert_equal(expected, filtered_records)
225+
end
226+
end
237227
end
