INDEX
    Explanations
    Negative Logits
    Token               Logit
    "UnsafeEnabled"     -0.60
    "SequentialGroup"   -0.54
    "PreferredItem"     -0.53
    " ComVisible"       -0.50
    "awtextra"          -0.50
    " صوتيه"            -0.50
    " PeEnEo"           -0.49
    "setGeometry"       -0.49
    " ***!"             -0.47
    "twimg"             -0.47
    Positive Logits
    Token               Logit
    " to"               0.88
    " fallu"            0.66
    " voters"           0.65
    " Canadians"        0.64
    " fans"             0.63
    "guenos"            0.62
    " that"             0.60
    " Kenyans"          0.58
    " simplifié"        0.58
    " listeners"        0.57
    Activation Density: 0.001%

    No Known Activations