INDEX
    Negative Logits
    Token           Logit
    " ming"         -0.08
    "album"         -0.07
    " Wallpaper"    -0.07
    " Barbie"       -0.07
    "Bal"           -0.06
    "036"           -0.06
    "گاهی"          -0.06
    "ीं"            -0.06
    " kosten"       -0.06
    " pessim"       -0.06
    Positive Logits
    Token                        Logit
    " Quentin"                   0.07
    " governing"                 0.06
    " Fighting"                  0.06
    "ASY"                        0.06
    "(scene"                     0.06
    " Christianity"              0.06
    (token missing in source)    0.06
    " разв"                      0.05
    " dumpsters"                 0.05
    "생활"                       0.05
    Activation Density: 0.016%

    No Known Activations