INDEX
    Explanations

    negations and questions related to personal circumstances or opinions

    Negative Logits
    " whose"       -0.17
    "unday"        -0.16
    " who"         -0.16
    "’Ãł"          -0.16
    ","            -0.15
    "äs"           -0.15
    " sire"        -0.14
    " Roe"         -0.14
    "leftright"    -0.14
    "inand"        -0.14

    Positive Logits
    " sav"          0.19
    " pou"          0.17
    " aur"          0.15
    " compt"        0.15
    " eskort"       0.15
    "cle"           0.15
    " mange"        0.15
    " conn"         0.15
    "ROID"          0.15
    "gles"          0.15

    Activation Density: 0.014%

    No Known Activations