Major docs update (#573)
* Major docs update

* Update troubleshooting
* Add architectural considerations section
* Update BYOE docs
* Update runtime docs for `splunk_metadata.csv`
* Update "verify proper operation" in runtime docs

* Finish opening statement; grammar edits

* Finish opening statement; grammar edits

* Grammar

* Grammar

Co-authored-by: Ryan Faircloth <35384120+rfaircloth-splunk@users.noreply.github.com>
2 people authored and GitHub committed Jul 24, 2020
1 parent 915da37 commit 0d60c6e
Showing 11 changed files with 233 additions and 151 deletions.
46 changes: 46 additions & 0 deletions docs/architecture.md
@@ -0,0 +1,46 @@
# SC4S Architectural Considerations

There are some key architectural considerations and recommendations that will yield highly performant and reliable syslog
data collection while minimizing the "over-engineering" common in many syslog data collection designs. These
recommendations are not specific to Splunk Connect for Syslog, but rather stem from the syslog protocol itself -- and its age.

## The syslog Protocol

The syslog protocol was designed in the mid-1980s to offer very high-speed, network-based logging for network and security devices that
were (especially at the time) starved for CPU and I/O resources. For this reason, the protocol was designed for speed and efficiency at the
expense of resiliency/reliability. UDP was chosen for its ability to "send and forget" events over the network without regard for
(or acknowledgment of) receipt. In later years, TCP was added as a transport, as well as TLS/SSL. In spite of these additions, UDP still
retains favor as a syslog transport in most data centers, for the same reasons as originally designed.
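As a rough illustration (this is a generic syslog-ng sketch, not SC4S's actual configuration; the ports and certificate paths are placeholder examples), the three transports map to syslog-ng source drivers along these lines:

```
# Hypothetical syslog-ng source sketch -- ports and paths are examples only
source s_example {
    network(transport("udp") port(514));    # original "send and forget" transport
    network(transport("tcp") port(514));    # stream transport, added later
    network(transport("tls") port(6514)
        tls(key-file("/path/to/key.pem") cert-file("/path/to/cert.pem")));
};
```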

Because of these tradeoffs selected by the original designers (and retained to this day), traditional methods used to provide scale and
resiliency do not necessarily transfer to the syslog world. We will discuss (and reference) some of the salient points below.

## Collector Location

Because syslog is a "send and forget" protocol, it does not perform well when routed through substantial (and especially WAN) network infrastructure.
This _includes_ front-side load balancers. The most reliable way to collect syslog traffic is to provide for _edge_
(not centralized) collection. Resist the urge to centrally locate any syslog server (sc4s included) and expect the UDP and (stateless)
TCP traffic to "make it". Data loss will undoubtedly occur.

## syslog Data Collection at Scale

In concert with attempts to centralize syslog, many admins will co-locate several syslog-ng servers for horizontal scale, and load balance
to them with a front-side load balancer. For many reasons (that go beyond this short discussion) this is not a best practice. Briefly:

* The attempt to load balance for scale (and HA -- see below) will actually cause _more_ data loss, due to normal device operations and
attendant buffer loss, than would be the case if a simple, robust single server (or shared-IP cluster) were used.

* Front-side load balancing will also cause inadequate data distribution on the upstream side, leading to data unevenness on the indexers.

## HA Considerations and Challenges

In addition to scale, many opt to load balance for high availability. While a sound approach for stateful, application-level protocols such
as http, it does not work well for stateless, unacknowledged syslog traffic. Again, in the attempt to design for HA, more data ends up
being lost than with simpler designs such as vMotioned VMs. With syslog, always remember that the protocol _itself_ is lossy, and there
_will_ be data loss (think CD-quality (lossless) vs. MP3). Syslog data collection can be made, at best, "Mostly Available".

## UDP vs. TCP

Paradoxically, UDP actually ends up being the better choice for syslog resiliency. For an excellent discussion of this topic
(as well as the "myth" of load balancers for HA),
see [Performant AND Reliable Syslog: UDP is best](https://www.rfaircloth.com/2020/05/21/performant-and-reliable-syslog-udp-is-best/).
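To see the "send and forget" behavior concretely, the following minimal sketch (the port 5514 and the message text are arbitrary examples, not SC4S defaults) emits a UDP syslog-style datagram from bash; it succeeds even when nothing is listening, which is precisely why delivery is never guaranteed:

```shell
#!/usr/bin/env bash
# Emit one RFC3164-style datagram via bash's /dev/udp pseudo-device.
# <134> = facility local0, severity info. No listener is required; UDP
# gives the sender no indication of whether anything received the message.
echo '<134>demo-host sc4s-demo: hello syslog' > /dev/udp/127.0.0.1/5514 \
  && echo "datagram sent (no acknowledgment, no delivery guarantee)"
```

The same property means a dropped packet is silently lost: nothing in the sender's exit status distinguishes delivery from loss.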
42 changes: 20 additions & 22 deletions docs/configuration.md
@@ -35,34 +35,34 @@ for the alternate HEC destination `d_hec_FOO` to 24, set `SC4S_DEST_SPLUNK_HEC_F

## Creation of Additional Splunk HEC Destinations

Additional Splunk HEC destinations can be dynamically created through environment variables. When set, the destinations will be
created with the `DESTID` appended, for example: `d_hec_FOO`. These destinations can then be specified for use (along with any other
destinations created locally) either globally or per source. See the "Alternate Destination Use" in the next section for details.

| Variable | Values | Description |
|----------|---------------|-------------|
| SPLUNK_HEC_ALT_DESTS | Comma or space-separated UPPER case list of destination IDs | Destination IDs are UPPER case, single-word friendly strings used to identify the new destinations, which will be named with the `DESTID` appended, for example `d_hec_FOO` |
| SPLUNK_HEC_&lt;DESTID&gt;_URL | url | Example: `SPLUNK_HEC_FOO_URL=https://splunk:8088`. `DESTID` must be a member of the list specified in `SPLUNK_HEC_ALT_DESTS` configured above |
| SPLUNK_HEC_&lt;DESTID&gt;_TOKEN | string | Example: `SPLUNK_HEC_BAR_TOKEN=<token>`. `DESTID` must be a member of the list specified in `SPLUNK_HEC_ALT_DESTS` configured above |

* NOTE: The `DESTID` specified in the `URL` and `TOKEN` variables above _must_ match the `DESTID` entries enumerated in the
`SPLUNK_HEC_ALT_DESTS` list. For each `DESTID` value specified in `SPLUNK_HEC_ALT_DESTS` there must be a corresponding `URL` and `TOKEN`
variable set as well. Failure to do so will cause destinations to be created without proper HEC parameters, which will result in connection
failure.

* NOTE: Additional Splunk HEC destinations will _not_ be tested at startup. It is the responsibility of the admin to ensure that additional destinations
are provisioned with the correct URL(s) and tokens to ensure proper connectivity.

* NOTE: The disk and CPU requirements will increase proportionally depending on the number of additional HEC destinations in use (e.g. each HEC
destination will have its own disk buffer by default).
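For example, a hypothetical pair of alternate destinations `FOO` and `BAR` could be declared in the sc4s environment file as follows (the file path, URLs, and tokens below are placeholders, not real values):

```bash
# Hypothetical entries for /opt/sc4s/env_file -- all values are placeholders
SPLUNK_HEC_ALT_DESTS=FOO,BAR
SPLUNK_HEC_FOO_URL=https://splunk-foo.example.com:8088
SPLUNK_HEC_FOO_TOKEN=00000000-0000-0000-0000-000000000000
SPLUNK_HEC_BAR_URL=https://splunk-bar.example.com:8088
SPLUNK_HEC_BAR_TOKEN=11111111-1111-1111-1111-111111111111
```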

## Alternate Destination Use

All alternate destinations (including alternate HEC destinations) are configured for use in SC4S through the variables below. Global and/or
source-specific forms of the variables below can be used to send data to additional and/or alternate destinations.

* NOTE: The administrator is responsible for ensuring that any non-HEC alternate destinations are configured in the
local mount tree, and that the underlying syslog-ng process in sc4s properly parses them.

* NOTE: Do not include the primary HEC destination (`d_hec`) in any list of alternate destinations. The configuration of the primary HEC destination
is configured separately from that of the alternates below. However, _alternate_ HEC destinations (e.g. `d_hec_FOO`) should be configured below, just
@@ -192,10 +192,9 @@ Here is a snippet from the `splunk_metadata.csv` file:
juniper_netscreen,index,ns_index
```

The columns in this file are `key`, `metadata`, and `value`. Defaults are populated into this file at initial startup, and any changes
made will be preserved on subsequent startups. Changes can be made by modifying and/or adding rows in the table and specifying one or more
of the following `metadata`/`value` pairs for a given `key`:

* `index` to specify an alternate `value` for index
* `source` to specify an alternate `value` for source
@@ -206,15 +205,14 @@ made by modifying and/or adding rows in the table and specifying one or more of
indexed by Splunk. Changing this carries the same warning as the sourcetype above; this will affect the upstream TA. The template
choices are documented elsewhere in this "Configuration" section.

In this case, the `juniper_netscreen` key references a new index used for that data source called `ns_index`.

In general, for most deployments the index should be the only change needed; other default metadata should almost
never be overridden (particularly for the "Out of the Box" data sources). Even then, care should be taken when considering any alternates,
as the defaults for SC4S were chosen with best practices in mind.

The `splunk_metadata.csv` file should also be appended to with an appropriate default for the index when building a custom SC4S log path
(filter). Care should be taken during filter design to choose appropriate index, sourcetype and template defaults, so that admins are not
compelled to override them.
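For illustration, a hypothetical custom log path keyed `acme_firewall` might append a default index row such as (the key and index value here are invented examples, not shipped defaults):

```csv
acme_firewall,index,netfw
```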


36 changes: 14 additions & 22 deletions docs/gettingstarted/byoe-rhel7.md
@@ -47,12 +47,13 @@ sudo yum install ./epel-release-latest-*.noarch.rpm -y
sudo subscription-manager repos --enable rhel-7-server-optional-rpms
```

* Enable the "stable" unofficial repo for syslog-ng and install required packages. The last package, `syslog-ng-afsnmp`, is only required
when using the optional snmp trap collection facility (disabled by default).

```bash
cd /etc/yum.repos.d/
sudo wget https://copr.fedorainfracloud.org/coprs/czanik/syslog-ng-stable/repo/epel-7/czanik-syslog-ng-stable-epel-7.repo
sudo yum install syslog-ng syslog-ng-http syslog-ng-python syslog-ng-afsnmp
```

* Optional step: Disable the distro-supplied syslog-ng unit file, as the syslog-ng process configured here will run as the `sc4s`
@@ -64,14 +65,17 @@ sudo systemctl stop syslog-ng
sudo systemctl disable syslog-ng
```

* Download the latest bare_metal.tar from [releases](https://github.com/splunk/splunk-connect-for-syslog/releases) on github and untar the package in `/etc/syslog-ng` using the command example below.

* NOTE: The `wget` process below will unpack a tarball with the sc4s version of the syslog-ng config files in the standard
`/etc/syslog-ng` location, and _will_ overwrite existing content. Ensure that any previous configurations of syslog-ng are saved
if needed prior to executing the download step.

* NOTE: At the time of writing, the latest release is `v1.24.0`. The latest release is typically listed first on the page above, unless
there is an `-alpha`, `-beta`, or `-rc` release that is newer (which will be clearly indicated). For production use, select the latest release that does not have an `-rc`, `-alpha`, or `-beta` suffix.

```bash
sudo wget -c https://github.com/splunk/splunk-connect-for-syslog/releases/download/<latest release>/baremetal.tar -O - | sudo tar -x -C /etc/syslog-ng
```

* Install gomplate and confirm that the version is 3.5.0 or newer
@@ -82,10 +86,6 @@ sudo chmod 755 /usr/local/bin/gomplate
gomplate --version
```

* Create the sc4s unit file ``/lib/systemd/system/sc4s.service`` and add the following content

```ini
# (unit file contents collapsed in this view)
```
@@ -122,7 +122,6 @@ Add the following content (but be sure to check the note above to ensure the latest release is used):
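The unit file body is collapsed in this view. As a rough, hypothetical sketch only (the paths, process name, and options below are assumptions for illustration, not necessarily SC4S's shipped unit), such a unit might resemble:

```ini
[Unit]
Description=SC4S syslog-ng (hypothetical sketch)
After=network.target

[Service]
Type=simple
# Assumed location of the preconfiguration script created in the steps below
ExecStartPre=/usr/bin/bash /opt/sc4s/bin/preconfig.sh
# Run syslog-ng in the foreground under systemd supervision
ExecStart=/usr/sbin/syslog-ng -F
Restart=on-failure

[Install]
WantedBy=multi-user.target
```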

```bash
#!/usr/bin/env bash

cd /etc/syslog-ng
#The following is no longer needed but retained as a comment just in case we run into command line length issues
@@ -136,7 +135,8 @@ cd /etc/syslog-ng
# --output-map="$d/{{ .in | strings.ReplaceAll \".conf.tmpl\" \".conf\" }}"
#done

# Ensure gomplate is in the shell path or provide the full pathname to the executable
/usr/local/bin/gomplate $(find . -name "*.tmpl" | sed -E 's/^(\/.*\/)*(.*)\..*$/--file=\2.tmpl --out=\2/') --template t=go_templates/

mkdir -p /etc/syslog-ng/conf.d/local/context/
mkdir -p /etc/syslog-ng/conf.d/local/config/
@@ -145,9 +145,9 @@ for file in /etc/syslog-ng/conf.d/local/context/*.example ; do cp -v -n $file ${
cp -v -R /etc/syslog-ng/local_config/* /etc/syslog-ng/conf.d/local/config/
```

* Execute the preconfiguration shell script created above prior to starting sc4s. You may also optionally execute it as part of a systemd unit
file (as shown above), which is recommended. If you elect _not_ to execute the script as part of systemd, care must be taken to execute it
manually "out of band" when any changes are made.

```bash
sudo bash /opt/sc4s/bin/preconfig.sh
@@ -187,12 +187,4 @@ the data. In other cases, a unique listening port is required for certain devices.
For collection of such sources we provide a means of dedicating a unique listening port to a specific source.

Refer to the "Sources" documentation to identify the specific environment variables used to enable unique listening ports for the technology
in use.
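As a sketch of what such a setting looks like (the variable name and port below are illustrative; the exact variable for each technology is defined in the "Sources" documentation):

```bash
# Hypothetical example -- consult the "Sources" docs for the real variable
# name for your device; 5514 is an arbitrary unprivileged port.
SC4S_LISTEN_CISCO_ASA_UDP_PORT=5514
```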
