INDEX
    Negative Logits
    Token          Logit
    " sergeant"    -0.06
    " Prim"        -0.06
    "venth"        -0.06
    " Fourth"      -0.06
    " Civ"         -0.06
    "_si"          -0.06
    "etre"         -0.06
    " Petsc"       -0.06
    " tweeted"     -0.06
    " srpna"       -0.06
    Positive Logits
    Token          Logit
    " allow"       0.11
    " allowing"    0.09
    " Allow"       0.09
    "allow"        0.09
    " allows"      0.09
    "Allow"        0.09
    " allowed"     0.09
    "Allows"       0.09
    "-Allow"       0.08
    "нолог"        0.08
    Activation Density: 0.043%

    No Known Activations