    NEGATIVE LOGITS
    " Monk"           -0.09
    " festa"          -0.09
    " eligibility"    -0.08
    "="./"            -0.08
    "เทศ"             -0.07
    "Eligibility"     -0.07
    "ڻو"              -0.07
    " وڃ"             -0.07
    " ашиг"           -0.07
    " Kirch"          -0.07
    POSITIVE LOGITS
    " қызы"           0.08
    "对此"            0.08
    " воспри"         0.08
    " polite"         0.07
    " sing"           0.07
    " treat"          0.07
    " CERN"           0.07
    " string"         0.07
    "curl"            0.07
    " wéi"            0.07
    Activation Density: 0.004%

    No Known Activations