|
| 1 | +# Initial setup |
| 2 | + |
| 3 | +The Nozzle requires a client with the authorities `doppler.firehose` and `cloud_controller.admin_read_only` (the latter is only required if `ADD_APP_INFO` is enabled) and grant-types `client_credentials` and `refresh_token`. If `cloud_controller.admin_read_only` is not |
| 4 | +available in the system, switch to use `cloud_controller.admin`. |
| 5 | + |
| 6 | +You can either |
| 7 | +* Add the client manually using [uaac](https://github.com/cloudfoundry/cf-uaac) |
| 8 | +* Add the client to the deployment manifest; see [uaa.clients](https://github.com/cloudfoundry/uaa-release/blob/master/jobs/uaa/spec) |
| 9 | + |
| 10 | +Manifest example: |
| 11 | + |
| 12 | +```yaml |
| 13 | + |
| 14 | +# Clients |
| 15 | +uaa.clients: |
| 16 | + splunk-firehose: |
| 17 | + id: splunk-firehose |
| 18 | + override: true |
| 19 | + secret: splunk-firehose-secret |
| 20 | + authorized-grant-types: client_credentials,refresh_token |
| 21 | + authorities: doppler.firehose,cloud_controller.admin_read_only |
| 22 | +``` |
| 23 | +
|
| 24 | +`uaac` example: |
| 25 | +```shell |
| 26 | +uaac target https://uaa.[system domain url] |
| 27 | +uaac token client get admin -s [admin client credentials secret] |
| 28 | +uaac client add splunk-firehose --name splunk-firehose \ |
| 29 | + --secret [your_client_secret] \ |
| 30 | + --authorized_grant_types client_credentials,refresh_token \ |
| 31 | + --authorities doppler.firehose,cloud_controller.admin_read_only |
| 32 | +
|
| 33 | +``` |
| 34 | + |
| 35 | +`cloud_controller.admin_read_only` will work for cf v241 |
| 36 | +or later. Earlier versions should use `cloud_controller.admin` instead. |
| 37 | + |
| 38 | +### Push as an App to Cloud Foundry |
| 39 | + |
| 40 | +Push the Splunk Firehose Nozzle as an application to Cloud Foundry. Refer to the **Initial setup** section above for details |
| 41 | +on user authentication. |
| 42 | + |
| 43 | +1. Download the latest release |
| 44 | + |
| 45 | + ```shell |
| 46 | + git clone https://github.com/cloudfoundry-community/splunk-firehose-nozzle.git |
| 47 | + cd splunk-firehose-nozzle |
| 48 | + ``` |
| 49 | + |
| 50 | +2. Authenticate to Cloud Foundry |
| 51 | + |
| 52 | + ```shell |
| 53 | + cf login -a https://api.[your cf system domain] -u [your id] |
| 54 | + ``` |
| 55 | + |
| 56 | +3. Copy the manifest template and fill in needed values (using the credentials created during setup) |
| 57 | + |
| 58 | + ```shell |
| 59 | + vim scripts/ci_nozzle_manifest.yml |
| 60 | + ``` |
| 61 | + |
| 62 | +4. Push the nozzle |
| 63 | + |
| 64 | + ```shell |
| 65 | + make deploy-nozzle |
| 66 | + ``` |
| 67 | + |
| 68 | +#### Dump application info to boltdb |
| 69 | +In production environments with a large number of CF applications (say, tens of thousands), enriching application logs |
| 70 | +with application metadata requires querying all application metadata from CF, which may take some time, |
| 71 | +for example when adding the app name, space ID, space name, org ID and org name to the events. |
| 72 | +With multiple Splunk nozzle instances deployed, the situation is even worse, since each instance queries all application metadata and |
| 73 | +caches it in its local boltdb file. These queries put load on the CF system and can take a long time to finish. |
| 74 | +Users can run this tool to generate a copy of all application metadata and copy it to each Splunk nozzle deployment. Each Splunk nozzle can then pick up the cached copy and update the cache file incrementally afterwards. |
| 75 | + |
| 76 | +Example of how to run the dump application info tool: |
| 77 | + |
| 78 | +``` |
| 79 | +$ cd tools/dump_app_info |
| 80 | +$ go build dump_app_info.go |
| 81 | +$ ./dump_app_info --skip-ssl-validation --api-endpoint=https://<your api endpoint> --user=<api endpoint login username> --password=<api endpoint login password> |
| 82 | +``` |
| 83 | + |
| 84 | +After populating the application info cache file, users can copy it to the different Splunk nozzle deployments and start each Splunk nozzle so that it picks up the cache file by |
| 85 | +specifying the correct `--boltdb-path` flag or `BOLTDB_PATH` environment variable. |
| 86 | + |
| 87 | +### Disable logging for noisy applications |
| 88 | +Set the environment variable `F2S_DISABLE_LOGGING` to `true` in the application's manifest to disable logging for that application. |
| 89 | + |
| 90 | + |
| 91 | +## Index routing |
| 92 | +Index routing is a feature that can be used to send different Cloud Foundry logs to different indexes for better ACL and data retention control in Splunk. |
| 93 | + |
| 94 | +### Per application index routing via application manifest |
| 95 | +To enable per-application index routing: |
| 96 | +* Set the environment variable `SPLUNK_INDEX` in your application's manifest ([example below](#example-manifest-file)) |
| 97 | +* Make sure the Splunk nozzle is configured with `ADD_APP_INFO` (select at least one of `AppName`, `OrgName`, `OrgGuid`, `SpaceName`, `SpaceGuid`) to enable app info caching |
| 98 | +* Make sure the `SPLUNK_INDEX` specified in the app's manifest exists in Splunk and can receive data for the configured Splunk HEC token. |
| 99 | + |
| 100 | +> **WARNING**: If `SPLUNK_INDEX` is invalid, events from other apps may also be lost, because Splunk drops the entire event batch if any event in the batch is invalid (e.g. has an invalid index). |
| 101 | + |
| 102 | +There are two ways to set the variable: |
| 103 | + |
| 104 | +In your app manifest provide an environment variable called `SPLUNK_INDEX` and assign it the index you would like to send the app data to. |
| 105 | + |
| 106 | +#### Example Manifest file |
| 107 | +``` |
| 108 | +applications: |
| 109 | +- name: <App-Name> |
| 110 | + memory: 256M |
| 111 | + disk_quota: 256M |
| 112 | + ... |
| 113 | + env: |
| 114 | + SPLUNK_INDEX: <SPLUNK_INDEX> |
| 115 | + ... |
| 116 | +``` |
| 117 | +
|
| 118 | +You can also update the environment variable on the fly using the cf CLI: |
| 119 | +``` |
| 120 | +cf set-env <APP_NAME> SPLUNK_INDEX <ENV_VAR_VALUE> |
| 121 | +``` |
| 122 | +#### Please note |
| 123 | +> If you are updating the environment variable on the fly, make sure that `APP_CACHE_INVALIDATE_TTL` is greater than 0s. Otherwise the cached app info will not be updated and events will not be sent to the required index.
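
To illustrate why a TTL of 0 keeps a stale `SPLUNK_INDEX`, here is a minimal sketch of a TTL-based cache. It is not the nozzle's actual implementation: the class `AppInfoCache`, its `fetch` callback and the injectable `clock` are hypothetical names, and the only behavior taken from the note above is that a TTL of 0 means cached entries are never refreshed.

```python
import time


class AppInfoCache:
    """Hypothetical TTL cache for app info; a sketch, not the nozzle's code."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch        # callable: app GUID -> dict of app info
        self._ttl = ttl_seconds    # assumption: 0 means "never invalidate"
        self._clock = clock        # injectable so the sketch can be tested
        self._entries = {}         # app GUID -> (fetched_at, info)

    def get(self, app_guid):
        entry = self._entries.get(app_guid)
        if entry is not None:
            fetched_at, info = entry
            # With ttl == 0 an entry never expires, so a later
            # `cf set-env ... SPLUNK_INDEX ...` is never picked up.
            expired = self._ttl > 0 and self._clock() - fetched_at >= self._ttl
            if not expired:
                return info
        info = self._fetch(app_guid)
        self._entries[app_guid] = (self._clock(), info)
        return info
```

With a positive TTL, the first lookup after the TTL has elapsed calls `fetch` again and sees the new index; with a TTL of 0 the first cached value is returned for the lifetime of the process.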
| 124 | +
|
| 125 | +
|
| 126 | +### Index routing via Splunk configuration |
| 127 | +Logs can be routed using fields such as app ID/name, space ID/name or org ID/name. |
| 128 | +Users can configure the Splunk configuration files props.conf and transforms.conf on Splunk indexers or Splunk Heavy Forwarders if deployed. |
| 129 | +
|
| 130 | +Below are a few sample configurations:
| 131 | +
|
| 132 | +1. Route data from application ID `95930b4e-c16c-478e-8ded-5c6e9c5981f8` to a Splunk `prod` index: |
| 133 | +
|
| 134 | +*$SPLUNK_HOME/etc/system/local/props.conf* |
| 135 | +``` |
| 136 | +[cf:logmessage] |
| 137 | +TRANSFORMS-index_routing = route_data_to_index_by_field_cf_app_id |
| 138 | +``` |
| 139 | +
|
| 140 | +
|
| 141 | +*$SPLUNK_HOME/etc/system/local/transforms.conf* |
| 142 | +``` |
| 143 | +[route_data_to_index_by_field_cf_app_id] |
| 144 | +REGEX = "(\w+)":"95930b4e-c16c-478e-8ded-5c6e9c5981f8" |
| 145 | +DEST_KEY = _MetaData:Index |
| 146 | +FORMAT = prod |
| 147 | +``` |
| 148 | +
|
| 149 | +
|
| 150 | +2. Route application logs from any Cloud Foundry org whose name is prefixed with `sales` to a Splunk `sales` index:
| 151 | +
|
| 152 | +*$SPLUNK_HOME/etc/system/local/props.conf* |
| 153 | +``` |
| 154 | +[cf:logmessage] |
| 155 | +TRANSFORMS-index_routing = route_data_to_index_by_field_cf_org_name |
| 156 | + |
| 157 | +``` |
| 158 | +
|
| 159 | +*$SPLUNK_HOME/etc/system/local/transforms.conf* |
| 160 | +``` |
| 161 | +[route_data_to_index_by_field_cf_org_name] |
| 162 | +REGEX = "cf_org_name":"(sales.*)" |
| 163 | +DEST_KEY = _MetaData:Index |
| 164 | +FORMAT = sales |
| 165 | +``` |
| 166 | +
|
| 167 | +3. Route data from sourcetype `cf:splunknozzle` to index `new_index`:
| 168 | +
|
| 169 | +*$SPLUNK_HOME/etc/system/local/props.conf* |
| 170 | +``` |
| 171 | +[cf:splunknozzle] |
| 172 | +TRANSFORMS-route_to_new_index = route_to_new_index |
| 173 | +``` |
| 174 | +
|
| 175 | +*$SPLUNK_HOME/etc/system/local/transforms.conf* |
| 176 | +``` |
| 177 | +[route_to_new_index] |
| 178 | +SOURCE_KEY = MetaData:Sourcetype |
| 179 | +DEST_KEY = _MetaData:Index
| 180 | +REGEX = (sourcetype::cf:splunknozzle) |
| 181 | +FORMAT = new_index |
| 182 | +``` |
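
Before deploying transforms like the three above, it can help to check the regular expressions against a sample event. The sketch below does that with Python's `re` module, on the assumption that it behaves like Splunk's PCRE engine for these simple patterns. The sample event is hypothetical, and the field name `cf_app_id` is inferred from the stanza name; real nozzle events contain more fields.

```python
import re

# (target index, regex copied from transforms.conf, key the regex is applied to)
RULES = [
    ("prod", re.compile(r'"(\w+)":"95930b4e-c16c-478e-8ded-5c6e9c5981f8"'), "raw"),
    ("sales", re.compile(r'"cf_org_name":"(sales.*)"'), "raw"),
    ("new_index", re.compile(r'(sourcetype::cf:splunknozzle)'), "sourcetype"),
]


def matching_indexes(raw_event, sourcetype_meta):
    """Return the indexes whose REGEX matches the event or its sourcetype key."""
    sources = {"raw": raw_event, "sourcetype": sourcetype_meta}
    return [index for index, regex, key in RULES if regex.search(sources[key])]


# Hypothetical event in compact JSON (no space after the colons).
sample_event = (
    '{"cf_app_id":"95930b4e-c16c-478e-8ded-5c6e9c5981f8",'
    '"cf_org_name":"sales-emea","msg":"hello"}'
)

print(matching_indexes(sample_event, "sourcetype::cf:logmessage"))
```

Note that the first two patterns expect `"key":"value"` with no whitespace around the colon, so an event serialized with spaces after the colons would not match either of them.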
| 183 | +**Note:** When upgrading from version 1.2.4 to 1.2.5, timestamps switch from millisecond to nanosecond precision.
| 184 | +
|
| 185 | +
|
| 186 | +## Monitoring (metric data ingestion)
| 187 | +
|
| 188 | +| Metric Name | Description | |
| 189 | +|----------------------------------|-----------------------------------------------------------------------------| |
| 190 | +| `nozzle.queue.percentage` | Percentage of the internal queue that is filled | |
| 191 | +| `splunk.events.dropped.count` | Number of events dropped from Splunk HEC | |
| 192 | +| `splunk.events.sent.count` | Number of events sent to Splunk | |
| 193 | +| `firehose.events.dropped.count` | Number of events dropped from the nozzle | |
| 194 | +| `firehose.events.received.count` | Number of events received from the firehose (websocket) | |
| 195 | +| `splunk.events.throughput` | Average payload size | |
| 196 | +| `nozzle.usage.ram` | RAM Usage | |
| 197 | +| `nozzle.usage.cpu` | CPU Usage | |
| 198 | +| `nozzle.cache.memory.hit` | How many times it has successfully retrieved the data from memory | |
| 199 | +| `nozzle.cache.memory.miss` | How many times it has unsuccessfully tried to retrieve the data from memory | |
| 200 | +| `nozzle.cache.remote.hit` | How many times it has successfully retrieved the data from remote | |
| 201 | +| `nozzle.cache.remote.miss` | How many times it has unsuccessfully tried to retrieve the data from remote | |
| 202 | +| `nozzle.cache.boltdb.hit` | How many times it has successfully retrieved the data from BoltDB | |
| 203 | +| `nozzle.cache.boltdb.miss` | How many times it has unsuccessfully tried to retrieve the data from BoltDB | |
| 204 | +
|
| 205 | + |
| 206 | +
|
| 207 | + |
| 208 | +
|
| 209 | +**Note:** Select `Rate(Avg)` as the Aggregation value in the Analysis tab at the top right.
| 210 | +
|
| 211 | +You can find a pre-made dashboard that can be used for monitoring in the `dashboards` directory. |
| 212 | +
|
| 213 | +### Routing data through Edge Processor via HEC
| 214 | +Logs can be routed to Splunk via Edge Processor. Assuming that you have a working Edge Processor instance, you can use it with minimal
| 215 | +changes to the nozzle configuration.
| 216 | +
|
| 217 | +Configuration fields that you should change are: |
| 218 | +* `SPLUNK_HOST`: Use the host of your Edge Processor instance instead of Splunk. Example: `https://x.x.x.x:8088`.
| 219 | +* `SPLUNK_TOKEN`: Required. The token used to authorize your requests; it can be found in the Edge Processor settings. If token
| 220 | + authentication is turned off on your Edge Processor, you can enter a placeholder value instead (e.g. `-`).
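
To see how these two settings map onto a HEC call, the sketch below builds (but does not send) a request of the kind an HEC client would issue. It assumes the Edge Processor accepts the standard HEC event path `/services/collector/event` and the usual `Authorization: Splunk <token>` header. The helper name `build_hec_request` and the host `edge-processor.example.com` are hypothetical and not part of the nozzle.

```python
import json
import urllib.request


def build_hec_request(splunk_host, splunk_token, event,
                      endpoint="/services/collector/event"):
    """Build a HEC POST request; the endpoint path is an assumption."""
    body = json.dumps({"event": event}).encode("utf-8")
    return urllib.request.Request(
        url=splunk_host.rstrip("/") + endpoint,
        data=body,
        headers={
            # HEC token authentication header; "-" serves as a placeholder
            # when token authentication is turned off.
            "Authorization": f"Splunk {splunk_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical Edge Processor host with token authentication turned off.
request = build_hec_request(
    "https://edge-processor.example.com:8088", "-", "hello from the nozzle"
)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```

Sending one such test event by hand is a quick way to confirm that the host and token are accepted before pointing the nozzle at the Edge Processor.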
| 221 | +
|
| 222 | +
|
| 223 | +
|