INDEX
    Explanations
    Negative Logits
    Suz           -0.09
     helft        -0.08
     plupart      -0.08
    Nag           -0.08
    ་�            -0.08
    CUS           -0.08
     Nag          -0.08
     москов       -0.08
     Gill         -0.07
    UREMENT       -0.07
    Positive Logits
     coffee       0.08
     objekt       0.08
                  0.08
     Kaffee       0.08
     FORCE        0.07
     campus       0.07
     Coffee       0.07
     alternativa  0.07
     directives   0.07
     thr          0.07
    Activation Density: 0.002%

    No Known Activations