    Explanations

    phrases indicating actions or opinions related to individuals or groups

    Negative Logits
    Token           Logit
    "ulk"           -0.15
    "ë´ī"           -0.14
    "ignon"         -0.14
    "пон"           -0.14
    "iÅŁim"         -0.14
    "atism"         -0.14
    "ingleton"      -0.14
    "uy"            -0.14
    "æĪIJ人"         -0.14
    " Jako"         -0.14
    Positive Logits
    Token           Logit
    "wait"          0.20
    " wait"         0.20
    " guarante"     0.19
    " waited"       0.18
    ".wait"         0.18
    " ready"        0.17
    " waiting"      0.17
    " WAIT"         0.17
    " guarantee"    0.17
    " Wait"         0.17
    Activation Density: 0.008%

    No Known Activations