    Explanations

    This neuron responds to occurrences of the substring “in”, activating strongly on the standalone word “in” and on tokens that begin with “in” (e.g. “input”, “internal”).

    Negative Logits
     meis                   -0.08
    .ut                     -0.07
    .orientation            -0.06
    อด                      -0.06
    .LinearLayoutManager    -0.06
    بینی                    -0.06
    ags                     -0.06
    .acc                    -0.06
    _WIN                    -0.06
    ΩΤ                      -0.06
    Positive Logits
    γορ                     0.07
    ;'↵                     0.07
    že                      0.06
     external               0.06
    ecute                   0.06
    freq                    0.06
     somewhat               0.06
     Hair                   0.06
    ','%                    0.06
     disgusting             0.06
    Activation Density: 0.019%

    No Known Activations