INDEX
    Negative Logits
    eeper     -0.10
     Bau      -0.10
     tam      -0.09
    anki      -0.09
     Kre      -0.09
     Vie      -0.09
    ntag      -0.08
     ﾘ        -0.08
     relax    -0.08
    ctl       -0.08
    Positive Logits
     stand    0.14
     against  0.14
     ally     0.13
    ротив     0.13
     counter  0.13
     den      0.13
     Stand    0.13
     condemn  0.12
     anti     0.12
    counter   0.12
    Act Density 0.100%

    No Known Activations