INDEX
    Explanations
    Negative Logits
    Token             Logit
    " accidental"     -0.07
    " zvlá"           -0.07
    "Supplier"        -0.07
    " wreck"          -0.06
    "дия"             -0.06
    " Especially"     -0.06
    " coolant"        -0.06
    " posture"        -0.06
    " zoo"            -0.06
    "(sorted"         -0.06
    Positive Logits
    Token             Logit
    "_face"           0.07
    (not shown)       0.07
    "formatter"       0.06
    " arkadaş"        0.06
    " собой"          0.06
    "_ghost"          0.06
    "beam"            0.06
    "-that"           0.06
    ":url"            0.06
    " Venez"          0.06
    Activation Density: 0.000%

    No Known Activations