    Explanations

    The neuron fires on religious-style terms of worship or praise (e.g. “worship,” “glorification”).

    Negative Logits
    " deltas"       -0.08
    " Madd"         -0.07
    " case"         -0.07
    "Ein"           -0.07
    " Finn"         -0.07
    " eased"        -0.07
    "-release"      -0.07
    "(click"        -0.06
    "dělen"         -0.06
    "۱۰"            -0.06
    Positive Logits
    " worship"      0.14
    " Worship"      0.12
    " worsh"        0.09
    "hab"           0.08
                    0.07
    "hip"           0.07
    "HIP"           0.07
    "uner"          0.07
    " země"         0.07
    " reverence"    0.07
    Activation Density: 0.003%

    No Known Activations