INDEX
    Explanations

    This neuron fires on the word “prob”, i.e. it detects prompts that pose probability questions.

    Negative Logits
    Token               Logit
    "PLE"               -0.07
    "_SER"              -0.07
    " UTC"              -0.06
    " صاح"              -0.06
    " Junction"         -0.06
    (token not shown)   -0.06
    ".getProperty"      -0.06
    ".demo"             -0.06
    "Fully"             -0.06
    " sailors"          -0.06

    Positive Logits
    Token               Logit
    " мо"               0.06
    " memiliki"         0.06
    "/st"               0.06
    "-prom"             0.06
    "рії"               0.06
    ".rc"               0.06
    "基金"              0.06
    " absent"           0.05
    " exemptions"       0.05
    " Complex"          0.05

    Act Density 0.001%

    No Known Activations