    Negative Logits
    Token             Logit
    (blank)           -0.08
    "rama"            -0.08
    " Fin"            -0.07
    " known"          -0.07
    " setups"         -0.06
    " creature"       -0.06
    "Know"            -0.06
    " Run"            -0.06
    " disdain"        -0.06
    "UB"              -0.06
    Positive Logits
    Token             Logit
    " také"           0.06
    (blank)           0.06
    "aload"           0.06
    " Dumbledore"     0.06
    "ibrary"          0.06
    " snad"           0.06
    "ticket"          0.06
    ".setProperty"    0.06
    "adlo"            0.06
    " سلام"           0.06
    Activation Density: 0.021%

    No Known Activations