(g)ULP!
Architecture

Introduction

This document gives a high-level overview of GULP's architecture.

GULP architecture

TLDR :)

flowchart LR
gulp[gULP
  main
  API server]

collab[(PostgreSQL
  collaboration DB)]

extension_plugin[ExtensionPlugins
  i.e. extend API,
    add functionality,
    whatever...]

data_plugin[IngestionPlugins
  parse events]

sigma_plugin[SigmaPlugins
  converts Sigma YAML
  to
  Lucene DSL
]

opensearch[(OpenSearch
  data)]

gulp <-->|users,
  sessions,
  operations,
  clients,
  glyph,
  notes,
  highlights,
  links,
  arbitrary shared data,
  stored queries,
  stats
  | collab

gulp <-->|startup| extension_plugin
gulp <-->|ingest| data_plugin
data_plugin<-.ingest.->opensearch
gulp <-->|query| sigma_plugin<-.query.->opensearch
gulp<-.query raw (DSL),
  query using simple filter.->opensearch

All components are based on the muty utility library

flowchart TB

muty[muty
  utilities & primitives lib]

muty<-.->gULP
muty<-.->ExtensionPlugins
muty<-.->IngestionPlugins
muty<-.->SigmaPlugins

Plugins

All plugins conform to the PluginBase class.

Doxygen documentation may be generated in the ./docs directory to ease browsing of the plugin-related source code. To generate it, run git submodule update --init --recursive && doxygen from the root of the cloned repository.

  • ingestion plugins ingest events from different sources (e.g. Windows evtx, Apache CLF, ...), creating documents on OpenSearch.
    • they must be in $PLUGINDIR and named gi_name.py/c.
    • documents are created by mapping each event field to the ECS standard as closely as possible, resulting in ECS-compliant data stored in the OpenSearch index.

      ingestion plugins may be stacked in a chain, as in the chrome_history_sqlite_stacked plugin.

  • sigma plugins leverage pysigma pipelines to convert Sigma rules to Lucene DSL queries, mapping the original rule fields to ECS.
    • they must be in $PLUGINDIR/sigma and named gs_name.py/c.
    • to implement sigma plugins, look at the windows sigma plugin.
  • extension plugins are loaded at startup and may extend the API and add functionality.
    • they must be in $PLUGINDIR/extension and named ge_name.py/c.

      look at the extension_example to see how they work (documentation inside).

  • query plugins are used by the query_plugin API to support querying external sources (e.g. a SIEM, ...)
    • they must be in $PLUGINDIR/query and named gq_name.py/c.
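The naming rules above can be summarized in a small lookup. This is only a sketch of the convention as listed, not GULP's actual loader: the `PLUGIN_LAYOUT` table and the `plugin_candidates` helper are hypothetical names, and `.py/c` is read here as "either `.py` or `.pyc`".

```python
from pathlib import Path

# Hypothetical table derived from the list above:
# plugin kind -> (subdirectory under $PLUGINDIR, filename prefix)
PLUGIN_LAYOUT = {
    "ingestion": ("", "gi_"),
    "sigma": ("sigma", "gs_"),
    "extension": ("extension", "ge_"),
    "query": ("query", "gq_"),
}


def plugin_candidates(plugin_dir: str, kind: str, name: str) -> list:
    """Return the .py and .pyc paths where a plugin of this kind is expected."""
    subdir, prefix = PLUGIN_LAYOUT[kind]
    base = Path(plugin_dir) / subdir  # an empty subdir leaves plugin_dir unchanged
    return [base / f"{prefix}{name}{ext}" for ext in (".py", ".pyc")]


if __name__ == "__main__":
    for path in plugin_candidates("/opt/gulp/plugins", "sigma", "windows"):
        print(path)
```

Keeping the rules in one table makes it easy to check where a plugin file should live before looking into why it was not loaded.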

Mapping files

To customize mappings, plugins may use specifically crafted JSON files placed in the ecs directory.

Examples of such files and the related API parameters may be found in the ecs folder and in plugin_internal_py.

For extensive use of custom mappings, look at the csv plugin, which can ingest arbitrary CSV files with (or without) mapping files.

API

Once gulp is started, the API is available via the OpenAPI endpoint.

API flow

sequenceDiagram
  autonumber
  participant client
  participant gulp

  client->>gulp: login(user, password)
  gulp-->>client: token
  client->>gulp: some_api(token, ...)
  gulp-->>client: JSEND response
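The login-then-call flow above can be sketched in client code as follows. This is a minimal illustration under stated assumptions, not GULP's client library: the `send` callable stands in for whatever HTTP transport is used, the parameter names `user`, `password` and `token` and the `{"token": ...}` shape of the login data are assumed from the diagram, and the envelope handling follows the generic JSend convention (`status` is `success`, `fail` or `error`).

```python
import json
from typing import Any, Callable


class ApiError(Exception):
    """Raised when a JSend envelope reports 'fail' or 'error'."""


def unwrap_jsend(raw: str) -> Any:
    """Return the 'data' member of a successful JSend envelope, else raise."""
    envelope = json.loads(raw)
    status = envelope.get("status")
    if status == "success":
        return envelope.get("data")
    if status == "fail":
        raise ApiError(f"request failed: {envelope.get('data')}")
    if status == "error":
        raise ApiError(f"server error: {envelope.get('message')}")
    raise ApiError(f"not a JSend envelope: status={status!r}")


def call_with_login(send: Callable[[str, dict], str], user: str, password: str,
                    api: str, **params: Any) -> Any:
    """Log in, then call `api` passing the returned token (hypothetical flow)."""
    # Assumption: the login response carries the token as data["token"].
    login_data = unwrap_jsend(send("login", {"user": user, "password": password}))
    token = login_data["token"]
    return unwrap_jsend(send(api, {"token": token, **params}))
```

Injecting `send` keeps the sketch independent of any particular HTTP library and makes the flow easy to exercise with a stub.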

Users must first be created using an ADMIN account:

sequenceDiagram
  autonumber
  participant client
  participant gulp

  client->>gulp: login(admin, password)
  gulp-->>client: token
  client->>gulp: user_create(token, ...)
  gulp-->>client: JSEND response

For ingestion, an ingestion client must first be created by an admin user:

sequenceDiagram
  autonumber
  participant client
  participant gulp

  client->>gulp: login(admin, password)
  gulp-->>client: token
  client->>gulp: client_create(token, ...)
  gulp-->>client: JSEND response

Websocket

The endpoint /ws provides live feedback and results for ingestion, queries and collaboration objects via websocket:

Make sure you pass the ws_id value as a string.

sequenceDiagram

client->>server: {"token": "...", "ws_id": "...", "types": [...]}
server-->>client: { WsData }

Response from the websocket is a WsData object like the following:

{
// type can be one of the defined WsQueueDataType in gulp/api/rest/ws.py
// it will be checked against the websocket parameter "types", so that only the types the websocket is interested in are sent to it (empty "types" = send all)
"type": 2,
// data contains the object (GulpStats or CollabObject) itself
"data": {
"type": 5,
"req_id": "099ff6b6-65fb-41aa-821f-d7fc9bd612c1",
"operation_id": 1,
"client_id": 1,
"context": "testcontext",
"status": 0,
"time_created": 1720705375978,
"time_expire": 1720791775978,
"time_update": 1720705447966,
"time_end": 0,
"ev_failed": 3,
"ev_skipped": 1,
"ev_processed": 66602,
"files_processed": 23,
"files_total": 24,
"ingest_errors": {
"/home/gulp/gulp/samples/win_evtx/sample_with_a_bad_chunk_magic.evtx": [
"[/home/gulp/gulp/src/gulp/plugins/win_evtx.py:ingest:349] IndexError: list index out of range\n"
],
"/home/gulp/gulp/samples/win_evtx/sysmon.evtx": [
"[/home/gulp/gulp/src/gulp/plugins/win_evtx.py:ingest:337] RuntimeError: Failed to parse chunk header\n"
],
"/home/gulp/gulp/samples/win_evtx/security.evtx": [
"[/home/gulp/gulp/src/gulp/plugins/win_evtx.py:record_to_gulp_document:138] File \"<string>\", line 33\n[/home/gulp/gulp/src/gulp/plugins/win_evtx.py:record_to_gulp_document:138] lxml.etree.XMLSyntaxError: PCDATA invalid Char value 3, line 33, column 33\n"
]
},
"current_src_file": "/home/gulp/gulp/samples/win_evtx/security_big_sample.evtx"
},
"req_id": "...",
// the caller username, if available
"username": null,
"timestamp": 1720705447975,
"ws_id": "def"
}
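As an illustration of how a client might work with these messages, the sketch below builds the initial websocket payload, applies the "empty types means send all" rule, and condenses the ingestion statistics of a WsData object. The helper names are hypothetical, the field names are taken from the example above, and the real type values are the ones defined in gulp/api/rest/ws.py. The `//` comments in the example are annotations, so an actual message would need to be valid JSON before being parsed.

```python
import json
from typing import Optional


def build_ws_handshake(token: str, ws_id: object, types: Optional[list] = None) -> str:
    """Serialize the first message sent on /ws; ws_id is forced to a string."""
    return json.dumps({"token": token, "ws_id": str(ws_id), "types": types or []})


def wants(message: dict, types: list) -> bool:
    """Mirror the filtering rule: an empty 'types' list means every type is wanted."""
    return not types or message.get("type") in types


def summarize_ingest(message: dict) -> dict:
    """Condense the ingestion statistics carried in a WsData 'data' member."""
    data = message.get("data", {})
    errors = data.get("ingest_errors", {})
    return {
        "files": f"{data.get('files_processed', 0)}/{data.get('files_total', 0)}",
        "events_processed": data.get("ev_processed", 0),
        "events_failed": data.get("ev_failed", 0),
        "files_with_errors": sorted(errors),
    }


if __name__ == "__main__":
    # Trimmed-down version of the WsData example above.
    sample = {
        "type": 2,
        "data": {
            "ev_failed": 3,
            "ev_processed": 66602,
            "files_processed": 23,
            "files_total": 24,
            "ingest_errors": {"/home/gulp/gulp/samples/win_evtx/sysmon.evtx": ["..."]},
        },
        "ws_id": "def",
    }
    print(build_ws_handshake("...", "def", [2]))
    if wants(sample, [2]):
        print(summarize_ingest(sample))
```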