INDEX
    Explanations

    words and phrases related to rejection or prohibition

    Negative Logits
    ".Reporting"     -0.18
    "ÃŃc"            -0.14
    " Into"          -0.14
    "Ù쨵ÙĦ"          -0.14
    "semb"           -0.14
    "loub"           -0.14
    "ازÙħ"           -0.14
    "vil"            -0.13
    "ya"             -0.13
    "ei"             -0.13

    Positive Logits
    " altogether"    0.33
    "æİī"            0.33
    "/null"          0.28
    " entirely"      0.24
    "alto"           0.20
    " completely"    0.19
    " outright"      0.19
    "issippi"        0.18
    "ive"            0.18
    "/block"         0.18

    Activation Density: 0.163%

    No Known Activations