INDEX
    Explanations
    Negative Logits
    " bureaucracy"    -0.07
    ";r"              -0.07
    "נושא"            -0.07
    "(light"          -0.07
    "rng"             -0.07
    "出入"            -0.07
    "_BLUE"           -0.07
    " centerX"        -0.07
    " dove"           -0.07
    "ër"              -0.07
    Positive Logits
    " bày"            0.07
    "esis"            0.07
    "常委"            0.06
    "หลาก"            0.06
    " כפי"            0.06
    (token not shown) 0.06
    (token not shown) 0.06
    " Paras"          0.06
    (token not shown) 0.06
    " Dmit"           0.06
    Activation Density: 0.126%

    No Known Activations