INDEX
    Explanations
    Negative Logits
    keepers            -0.08
    (no token shown)   -0.08
     buying            -0.07
     mjesta            -0.07
    Schools            -0.07
     Borges            -0.07
    bundet             -0.07
     आलो               -0.07
     Jes               -0.07
    Tout               -0.07
    Positive Logits
     malam             0.09
     воздуха           0.09
    ickname            0.08
    HUD                0.08
     airflow           0.08
     atmosphere        0.08
     berubah           0.08
     hawa              0.08
     beforehand        0.08
    年月               0.08
    Act Density: 0.011%

    No Known Activations