INDEX
    Explanations

    the negation or sense of prohibition in statements

    Negative Logits
    .*")]       -0.38
    layui       -0.32
    twimg       -0.31
    a           -0.29
     sosial     -0.27
     ↑          -0.27
    Eloquent    -0.27
     Mur        -0.27
    aler        -0.26
    مر          -0.26
    Positive Logits
     ſind       0.75
     ſehen      0.73
     ſche       0.71
     Weiſe      0.71
    <unused41>  0.70
     unſer      0.70
    <unused79>  0.70
    <unused52>  0.70
    <unused11>  0.70
    <pad>       0.69
    Act Density 0.000%

    No Known Activations