    Negative Logits
    Token            Logit
    " Fleisch"       -0.35
    "Kaynakça"       -0.34
    " empez"         -0.34
    " یافته"         -0.34
    " établi"        -0.33
    " établie"       -0.33
    "ycznego"        -0.32
    " rodea"         -0.31
    " mewujudkan"    -0.31
    "ség"            -0.31
    Positive Logits
    Token            Logit
    "bot"            3.50
    "Bot"            2.91
    "BOT"            2.59
    " bot"           2.47
    " Bot"           2.47
    " BOT"           2.13
    "bots"           2.05
    "Bots"           1.72
    " bots"          1.70
    "bott"           1.56
    Activation Density: 0.005%

    No Known Activations