INDEX
    Explanations

    expressions related to personal reflection and societal responsibilities

    Negative Logits
     probably
    -0.15
    enson
    -0.15
     real
    -0.15
     bon
    -0.15
     vera
    -0.15
     direct
    -0.14
     slightly
    -0.14
     rather
    -0.14
     Cot
    -0.14
    rna
    -0.14
    Positive Logits
     anymore
    0.31
     nor
    0.23
     ANY
    0.20
     anybody
    0.18
    任何
    0.16
    nor
    0.15
    aeda
    0.15
     इतन
    0.15
     nào
    0.15
    ίτ
    0.15
    Act Density 0.178%

    No Known Activations