    Explanations

    verbs expressing positive or negative evaluations of actions and statements

    Negative Logits
    Token       Logit
    "liter"     -0.16
    "ados"      -0.15
    "bet"       -0.15
    "andre"     -0.15
    "Ố"         -0.14
    "ASC"       -0.14
    " Пок"      -0.14
    "ober"      -0.14
    "irror"     -0.14
    "licer"     -0.14
    Positive Logits
    Token       Logit
    "chio"      0.16
    "ival"      0.16
    "icle"      0.15
    "ä¾"        0.15
    "ics"       0.14
    "igham"     0.13
    "(pg"       0.13
    "IVAL"      0.13
    " pres"     0.13
    "icles"     0.13
    Activation Density: 0.073%

    No Known Activations