INDEX
    Explanations

    general English language

    Negative Logits
    Token          Logit
    "tom"          -0.08
    "otope"        -0.06
    "...."         -0.06
                   -0.06
    "utation"      -0.06
    "_adj"         -0.06
    "ionate"       -0.06
    " fik"         -0.06
    "Px"           -0.06
    " scl"         -0.06
    Positive Logits
    Token          Logit
    "aqu"          0.07
    "СТ"           0.06
    " lubric"      0.06
    " Birleşik"    0.06
    "/TR"          0.06
    " [..."        0.06
    "**↵"          0.06
    "Fully"        0.06
    " Hanson"      0.06
    "lenir"        0.06
    Activation Density: 0.000%

    No Known Activations