INDEX
    Explanations

    names, identifiers, or prefixes

    NEGATIVE LOGITS
    Logit  Token
    1.11    (!)
    1.08   ↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵
    1.08   ↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵
    1.07
    1.04   ↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵
    1.03   ↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵
    1.02   ↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵
    1.02   ↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵↵
    1.01   ↵↵↵↵↵↵↵↵↵↵↵↵↵
    1.01    (=
    POSITIVE LOGITS
    Logit  Token
    1.61   _
    1.11   Ch
    1.06   St
    1.04   Sh
    1.03   __
    1.02   W
    1.00   J
    0.96   Th
    0.94   G
    0.93   An
    Act Density 0.008%

    No Known Activations