INDEX
    Explanations

    expressions of personal opinions and feelings about social interactions

    Negative Logits
    Token     Logit
    IB        -0.15
    gart      -0.14
    jes       -0.14
    kö        -0.14
    su        -0.14
    jb        -0.14
     Geb      -0.14
    =""></    -0.14
     Xxx      -0.14
    ungi      -0.13
    Positive Logits
    Token     Logit
     would    0.18
    would     0.16
     Would    0.16
    ureau     0.16
    Would     0.15
    cus       0.15
    ave       0.15
     wouldn   0.15
    象        0.14
    bbe       0.14
    Activation Density: 0.138%

    No Known Activations