Introduction to SPL: Splunk Search from Basics to Practice

Splunk is one of the most widely used log analytics platforms. At its core is **SPL (Search Processing Language)**, and how well you can use SPL largely determines how much value you get out of Splunk.

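This article walks through SPL from the basics to practical queries, with concrete examples.

What Is SPL?

SPL (Search Processing Language) is the language for searching and analyzing data that Splunk has indexed. Its design combines the Unix pipeline model with concepts from SQL, so you can filter, transform, and aggregate data one step at a time.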

flowchart LR
    subgraph Pipeline["SPL pipeline"]
        A["Search"] --> B["Filter"]
        B --> C["Transform"]
        C --> D["Aggregate"]
        D --> E["Display"]
    end
    style Pipeline fill:#3b82f6,color:#fff

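SPL's key characteristics are:

Pipeline processing: chain commands with | and process the data step by step
Time-series optimization: searches constrained by timestamp run fast
More than 140 commands: coverage ranges from statistics and transformation to visualization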

Basic Syntax

The Basic Shape of a Search

An SPL query starts with search terms, followed by commands chained with pipes (|).

index=web_logs status=500 | stats count by uri | sort -count | head 10

This query does the following:

Searches the web_logs index for events with status=500
Counts the events for each uri
Sorts by count in descending order
Shows the top 10 results

Search Terms

The first part of a search narrows down the data to work on.

# Specify an index
index=main

# Keyword search
error OR failed

# Match field values
status=404
host="web-server-01"

# Wildcards
source="/var/log/*.log"

# Negation
NOT status=200

# Time range (relative)
earliest=-24h latest=now

Important: index=* searches every index your role can access, which is usually far slower than necessary. Specify a concrete index whenever you can.
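
To choose a concrete index, it helps to know which indexes actually hold data. The following is a minimal sketch, assuming your role is allowed to run tstats. It counts events per index and sourcetype from index-time metadata, so it is typically much cheaper than a raw index=* search. The 24-hour window is an arbitrary choice for illustration.

```spl
# Assumption: tstats is permitted for your role; -24h is an arbitrary window
| tstats count where index=* earliest=-24h by index sourcetype
| sort -count
```

Once the relevant index is known, put it at the start of every search in place of the wildcard.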

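Specifying Time Ranges

Splunk is optimized for time-series data, so the time range you specify has a major effect on search performance.

# Relative time
earliest=-1h          # from 1 hour ago
earliest=-7d@d        # from midnight 7 days ago
earliest=@d           # from midnight today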

# Absolute time (default format is %m/%d/%Y:%H:%M:%S)
earliest="02/01/2026:00:00:00"
latest="02/05/2026:23:59:59"

# Snap-to operator (@)
earliest=-1d@d        # midnight yesterday
earliest=-1w@w        # start of last week
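
To check how a time modifier resolves before using it in a search, one option is to evaluate it with the eval function relative_time(). The following is a minimal sketch built on makeresults, so it needs no indexed data. The output field names are made up for illustration.

```spl
# Field names (one_hour_ago, today_midnight, ...) are illustrative only
| makeresults
| eval one_hour_ago       = strftime(relative_time(now(), "-1h"), "%Y-%m-%d %H:%M:%S")
| eval today_midnight     = strftime(relative_time(now(), "@d"), "%Y-%m-%d %H:%M:%S")
| eval yesterday_midnight = strftime(relative_time(now(), "-1d@d"), "%Y-%m-%d %H:%M:%S")
| eval last_week_start    = strftime(relative_time(now(), "-1w@w"), "%Y-%m-%d %H:%M:%S")
| table one_hour_ago, today_midnight, yesterday_midnight, last_week_start
```

Each column shows the timestamp that the corresponding modifier would produce at the moment the search runs.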

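Five Essential Commands

1. stats - Statistical Aggregation

stats is arguably the most important command in SPL. It groups events and calculates statistics over each group.

# Basic count
index=web_logs | stats count

# Group by a field
index=web_logs | stats count by status

# Multiple statistical functions
index=web_logs
| stats count, avg(response_time) as avg_time, max(response_time) as max_time by uri

# Commonly used statistical functions
# count    - number of events
# sum      - total
# avg      - average
# min/max  - minimum / maximum
# dc       - distinct count
# values   - list of distinct values
# latest   - most recent value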

2. eval - Calculating Fields

eval creates new fields and transforms existing ones.

# Create a new field
index=web_logs
| eval response_sec = response_time / 1000

# Conditional branching
index=web_logs
| eval status_category = case(
    status < 300, "success",
    status < 400, "redirect",
    status < 500, "client_error",
    true(), "server_error"
)

# String operations
index=web_logs
| eval domain = lower(host)
| eval short_uri = substr(uri, 1, 50)

# Date and time operations
index=web_logs
| eval hour = strftime(_time, "%H")
| eval day_of_week = strftime(_time, "%A")

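Useful eval functions:

Function	Description	Example
`if(condition, then, else)`	Conditional	`if(status=200, "OK", "Error")`
`case(cond1, value1, ...)`	Multi-way conditional	See the case example above
`coalesce(a, b, ...)`	First non-null value	`coalesce(user, "anonymous")`
`len(str)`	String length	`len(message)`
`replace(str, regex, new)`	Regex replacement	`replace(uri, "\d+", "N")`
`mvcount(field)`	Number of values in a multivalue field	`mvcount(tags)`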
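
The functions in the table can be tried without any indexed data. The following is a minimal sketch that builds a single synthetic event with makeresults. The sample values and the output field names (result, user_label, uri_pattern, and so on) are invented for illustration, and the user field is deliberately left undefined so that coalesce falls back to its default.

```spl
# Synthetic event; all values and output field names are illustrative
| makeresults
| eval status = 404, uri = "/users/12345/orders/678", tags = split("web,prod,eu", ",")
| eval result = if(status == 200, "OK", "Error")
| eval user_label = coalesce(user, "anonymous")
| eval uri_len = len(uri)
| eval uri_pattern = replace(uri, "\d+", "N")
| eval tag_count = mvcount(tags)
| table status, result, user_label, uri, uri_len, uri_pattern, tag_count
```

With these inputs, uri_pattern should come out as /users/N/orders/N and tag_count as 3, which makes it easy to see what each function contributes.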

3. timechart - Time-Series Charts

timechart aggregates data along the time axis and outputs it in a shape suited to charting.

# Event count over time
index=web_logs | timechart count

# Count by status code, per hour
index=web_logs | timechart span=1h count by status

# Average response time (5-minute intervals)
index=web_logs | timechart span=5m avg(response_time) as avg_response

# Several metrics at once
index=web_logs
| timechart span=1h count as requests, avg(response_time) as avg_time

flowchart TB
    subgraph timechart["How timechart works"]
        A["Raw logs"] --> B["Split into time buckets"]
        B --> C["Aggregate per bucket"]
        C --> D["Output time-series table"]
    end
    style timechart fill:#8b5cf6,color:#fff
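
The bucketing steps in the diagram can be approximated by hand with bin and stats. This is a rough equivalent for illustration, not a description of how timechart is implemented. Unlike timechart, it does not emit rows for time buckets that contain no events. The index name web_logs is the same placeholder used throughout this article.

```spl
# Rough manual equivalent of: index=web_logs earliest=-24h | timechart span=1h count
index=web_logs earliest=-24h
| bin _time span=1h
| stats count by _time
```

Seeing the two steps separately can make it easier to reason about what span controls: it sets the width of the buckets that _time is rounded down into before aggregation.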

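4. table / fields - Controlling Output Fields

table displays only the fields you specify, in tabular form. fields keeps or removes fields from the results.

# Show only specific fields
index=web_logs
| table _time, host, uri, status, response_time

# Exclude unneeded fields with fields
index=web_logs
| fields - _raw, _cd, _indextime

# Change field names with rename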
index=web_logs
| rename response_time as "Response Time (ms)", status as "HTTP Status"
| table _time, host, uri, "HTTP Status", "Response Time (ms)"

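5. where / search - Filtering

These commands filter data partway through the pipeline.

# where evaluates an expression (and works on calculated fields too)
index=web_logs
| eval response_sec = response_time / 1000
| where response_sec > 5

# search filters with search-term syntax (keywords and field=value)
index=web_logs
| stats count by uri, status
| search status=500

# Comparison operators and functions in where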
| where response_time > 1000
| where status >= 400 AND status < 500
| where like(uri, "/api/%")
| where match(user_agent, "(?i)bot")
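
One practical difference between the two commands is that where evaluates an eval expression, so it can compare one field against another, whereas search treats the right-hand side of a comparison as a literal value. The following is a minimal sketch on synthetic data. The field names bytes_in and bytes_out are hypothetical.

```spl
# Synthetic rows with bytes_in = 100, 200, 300 and bytes_out = 200 (hypothetical fields)
| makeresults count=3
| streamstats count as n
| eval bytes_in = n * 100, bytes_out = 200
| where bytes_in > bytes_out
| table n, bytes_in, bytes_out
```

Only the row with bytes_in=300 should remain, because it is the only one where the first field exceeds the second.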

Practical Examples

Example 1: Error Analysis Dashboard

A set of queries for analyzing web server errors.

# Distribution of HTTP status codes
index=web_logs earliest=-24h
| eval status_group = case(
    status < 300, "2xx Success",
    status < 400, "3xx Redirect",
    status < 500, "4xx Client Error",
    true(), "5xx Server Error"
)
| stats count by status_group
| sort status_group

# Top 10 endpoints by error count
index=web_logs status>=400 earliest=-24h
| stats count as errors by uri
| sort -errors
| head 10

# Error rate by hour
index=web_logs earliest=-24h
| timechart span=1h
    count(eval(status>=400)) as errors,
    count as total
| eval error_rate = round(errors / total * 100, 2)
| fields _time, errors, total, error_rate
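
The count(eval(...)) pattern used for the error rate can be checked on synthetic data before pointing it at real logs. The following is a minimal sketch that generates ten events with makeresults and marks every fifth one as a 500. The data is invented for illustration.

```spl
# Ten synthetic events; every fifth one gets status 500
| makeresults count=10
| streamstats count as n
| eval status = if(n % 5 == 0, 500, 200)
| stats count(eval(status>=400)) as errors, count as total
| eval error_rate = round(errors / total * 100, 2)
```

Two of the ten events match the condition, so errors should be 2 and the error rate 20 percent.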

Example 2: Performance Analysis

Analyzing response times.

# Percentile analysis
index=web_logs earliest=-1h
| stats
    avg(response_time) as avg,
    median(response_time) as p50,
    perc95(response_time) as p95,
    perc99(response_time) as p99,
    max(response_time) as max
| eval avg = round(avg, 2)

# Identify slow requests
index=web_logs earliest=-1h
| where response_time > 3000
| table _time, host, uri, response_time, status
| sort -response_time

Example 3: User Behavior Analysis

# Active users (daily)
index=web_logs earliest=-7d
| timechart span=1d dc(user_id) as unique_users

# Per-user session analysis
index=web_logs user_id=* earliest=-24h
| stats
    count as page_views,
    dc(uri) as unique_pages,
    min(_time) as first_access,
    max(_time) as last_access
    by user_id
| eval session_duration = last_access - first_access
| eval session_minutes = round(session_duration / 60, 1)
| table user_id, page_views, unique_pages, session_minutes
| sort -page_views
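
In the session query, min(_time) and max(_time) return epoch seconds, which are hard to read in a table. The following is a minimal sketch of one way to format them, using tostring with the "duration" option and strftime. It runs on synthetic events, and the user IDs and the session_hms field name are invented for illustration.

```spl
# Synthetic events: three for "alice" spaced 10 minutes apart, one for "bob"
| makeresults count=4
| streamstats count as n
| eval user_id = if(n <= 3, "alice", "bob"), _time = now() - n * 600
| stats count as page_views, min(_time) as first_access, max(_time) as last_access by user_id
| eval session_duration = last_access - first_access
| eval session_hms = tostring(session_duration, "duration")
| eval first_access = strftime(first_access, "%Y-%m-%d %H:%M:%S")
| table user_id, page_views, first_access, session_hms
```

Here alice's three events span 1200 seconds, so session_hms should read 00:20:00, while bob's single event gives a duration of zero.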

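Performance Best Practices

Key points for optimizing SPL queries.

1. Narrow the time range

# Bad: no time range, so the search may scan all time
index=web_logs status=500

# Good: specify the time range
index=web_logs status=500 earliest=-24h latest=now

2. Filter as early as possible

# Bad: filter after aggregating
index=web_logs | stats count by status | search status=500

# Good: filter in the base search
index=web_logs status=500 | stats count

3. Drop unneeded fields with fields

# For logs with a large number of fields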
index=web_logs earliest=-1h
| fields _time, host, uri, status, response_time
| stats avg(response_time) by host

4. stats is faster than transaction

# Slow: transaction
index=web_logs | transaction session_id | stats count

# Fast: stats
index=web_logs | stats count, values(uri) as pages by session_id | stats count
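
When a query relies on the duration and eventcount fields that transaction produces, stats can often supply comparable values. The following is a sketch under the assumption that every event for a session shares the same session_id and that no transaction-specific options such as maxspan or startswith are needed. The index and field names are the placeholders used in this article.

```spl
# Assumes session_id alone defines a session; web_logs and session_id are placeholders
index=web_logs earliest=-24h
| stats count as eventcount, range(_time) as duration, values(uri) as pages by session_id
```

range(_time) returns the difference between the latest and earliest timestamps in each group, which corresponds to the session length in seconds.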

Summary

Command	Purpose	Example
`stats`	Statistical aggregation	`stats count, avg(field) by group`
`eval`	Field calculation	`eval new_field = field1 + field2`
`timechart`	Time-series aggregation	`timechart span=1h count by status`
`table`	Field selection	`table _time, host, status`
`where`	Conditional filtering	`where response_time > 1000`
`sort`	Sorting	`sort -count` (descending)
`head/tail`	Limit number of results	`head 10`
`rename`	Rename fields	`rename field as "new name"`
`dedup`	Remove duplicates	`dedup host, uri`
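
Of the commands in the table, dedup is the only one not demonstrated earlier in the article. The following is a minimal sketch on synthetic data, with host and uri values invented for illustration. dedup keeps the first event for each distinct combination of the listed fields.

```spl
# Six synthetic events covering four distinct host/uri pairs
| makeresults count=6
| streamstats count as n
| eval host = if(n % 2 == 0, "web-01", "web-02"), uri = if(n <= 3, "/login", "/cart")
| dedup host, uri
| table host, uri
```

The six events contain four distinct host and uri combinations, so four rows should remain.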

SPL runs deep, with more than 140 commands, but once you are comfortable with the five covered here (stats, eval, timechart, table, where), you can handle many common use cases.