INDEX
    Explanations

    The neuron activates on the words “metric” and “metrics.”

    Negative Logits
    Token       Logit
    -unit       -0.07
    "a          -0.07
     ashamed    -0.07
     Noah       -0.07
    aw          -0.06
     straw      -0.06
     Yaz        -0.06
    alphabet    -0.06
    άνα         -0.06
    166         -0.06
    Positive Logits
    Token       Logit
     metrics    0.11
    metrics     0.10
     metric     0.09
     Metric     0.09
    _metric     0.09
    Metrics     0.09
     Metrics    0.09
    Metric      0.09
    mic         0.08
    _metrics    0.07
    Activation Density: 0.006%

    No Known Activations