INDEX
    Explanations

    references to personal experiences and collective actions

    Negative Logits
    ccione     -0.14
    _tF        -0.14
    _mC        -0.14
    ahy        -0.14
    大人       -0.13
    _tA        -0.13
    igli       -0.13
    _tE        -0.13
    .Apis      -0.13
    antt       -0.13
    Positive Logits
    idor        0.19
     Wie        0.16
    tempt       0.14
    zano        0.14
    landers     0.14
    ijo         0.14
    sing        0.14
     kernel     0.14
    semb        0.14
     More       0.14
    Activation Density: 0.170%

    No Known Activations