INDEX
    Negative Logits
    	sub            -0.07
    (blank)        -0.07
     ridden        -0.07
     naopak        -0.07
     zelf          -0.07
     indust        -0.06
     Haram         -0.06
     hesitation    -0.06
    (blank)        -0.06
    CONTROL        -0.06
    Positive Logits
    .Many           0.07
     treasury       0.06
    (username       0.06
     Treasury       0.06
    .admin          0.06
    .↵↵↵↵↵          0.06
    итуа            0.06
    :X              0.06
     tiện           0.06
     advertiser     0.06
    Activation Density: 0.003%

    No Known Activations