INDEX
    Explanations
    Negative Logits
     людина       -0.07
    .fi           -0.07
    kf            -0.07
    .ap           -0.07
    人が          -0.07
     punching     -0.07
    unchecked     -0.07
     frogs        -0.06
    LD            -0.06
    اذ            -0.06
    Positive Logits
     Seam         0.07
    (hex          0.06
    .Record       0.06
     Portions     0.06
     RTWF         0.06
     IS           0.06
    (env          0.06
     حساب         0.06
     đào          0.06
     Gluten       0.06
    Activation Density: 0.098%

    No Known Activations