INDEX
    Negative Logits
    Token         Logit
    "zinski"      -0.72
    "Ĥ¬"          -0.71
    "cutting"     -0.67
    "luaj"        -0.67
    "ccording"    -0.66
    "rongh"       -0.66
    " Scouting"   -0.63
    "âĶĢâĶĢ"      -0.60
    " Nanto"      -0.60
    "zar"         -0.60
    Positive Logits
    Token            Logit
    " questions"     1.17
    " rhet"          1.08
    " forgiveness"   1.05
    "naires"         1.04
    " probing"       1.00
    " permission"    0.93
    " plaint"        0.89
    "ingly"          0.87
    " politely"      0.86
    " respondents"   0.85
    Activation Density: 0.047%

    No Known Activations