    NEGATIVE LOGITS
    "cal"          -0.08
    "original"     -0.07
    "Prov"         -0.07
    "_guard"       -0.07
    " IDENT"       -0.07
    " Top"         -0.07
    "doll"         -0.06
    " Fuck"        -0.06
    " VOL"         -0.06
    " Kol"         -0.06
    POSITIVE LOGITS
    " mixed"       0.10
    " Mixed"       0.10
    "Mixed"        0.08
    "-lived"       0.07
    "mixed"        0.07
    " applied"     0.06
    "اهرة"         0.06
    " смеш"        0.06
    " inc"         0.06
    ".TrimSpace"   0.06
    Activation Density: 0.004%

    No Known Activations