INDEX
    Explanations

    negations or expressions of reluctance and denial

    Negative Logits
    Token         Logit
    "нÑıÑĤ"        -0.16
    "ektor"       -0.16
    ".wp"         -0.16
    "auen"        -0.14
    "chte"        -0.14
    " Jeg"        -0.14
    "newline"     -0.14
    "ariat"       -0.14
    ".tbl"        -0.14
    "locker"      -0.14
    Positive Logits
    Token            Logit
    "busy"           0.14
    "ạng"            0.14
    "NG"             0.14
    "lig"            0.14
    "Jay"            0.14
    " Fantasy"       0.14
    " Corruption"    0.14
    " Jay"           0.14
    "lop"            0.14
    "wn"             0.13
    Activation Density: 0.145%

    No Known Activations