    Negative Logits
    Token                Logit
    " lst"               -0.07
    "KIT"                -0.07
    ".select"            -0.07
    " dark"              -0.07
    " lavor"             -0.07
    " sided"             -0.06
    "\tint"              -0.06
    " plunge"            -0.06
    " thuisontvangst"    -0.06
    "_px"                -0.06
    Positive Logits
    Token                Logit
    " Mercy"             0.07
    [token missing]      0.07
    "�i"                 0.06
    " myriad"            0.06
    " رفتار"             0.06
    [token missing]      0.06
    " Muhammad"          0.06
    " deploying"         0.06
    " realidad"          0.06
    [token missing]      0.05
    Activation Density: 0.009%

    No Known Activations