INDEX
    Explanations
    NEGATIVE LOGITS
    “This           -0.09
    Questo          -0.08
     attentes       -0.08
     ماش            -0.08
     iyadoo         -0.08
     maş            -0.08
     verwachtingen  -0.07
     awe            -0.07
    Testimonials    -0.07
    Kills           -0.07
POSITIVE LOGITS
     Nik            0.08
     hat            0.08
    IRM             0.08
     контракт       0.08
    .imag           0.07
     Ret            0.07
     RET            0.07
    928             0.07
     Reality        0.07
    126             0.07
    Activation Density 0.001%

    No Known Activations