    Explanations

    phrases indicating emotional responses and personal experiences

    Negative Logits
    Logit   Token
    -0.73   " الرياضيه"
    -0.66   "ukone"
    -0.63   "othek"
    -0.62   "はじめに"
    -0.62   "новништво"
    -0.61   " autorytatywna"
    -0.61   "protoimpl"
    -0.60   "yadh"
    -0.59   "LANTA"
    -0.58   "ADELPHIA"
    Positive Logits
    Logit   Token
     0.70   "DebuggerNonUser"
     0.62   " Jefus"
     0.60   "ſel"
     0.59   "ISupport"
     0.58   " Inſ"
     0.57   "めでとう"
     0.57   "noopener"
     0.56   " houſe"
     0.54   " ſta"
     0.54   "ffions"
    Activation Density: 1.496%

    No Known Activations