    NEGATIVE LOGITS
    Token         Logit
    zugeben       -0.10
     Gig          -0.08
    zhi           -0.08
     झाली         -0.08
    Cargo         -0.08
     Nietzsche    -0.07
     Titan        -0.07
    zam           -0.07
    killer        -0.07
    holt          -0.07
    POSITIVE LOGITS
    Token         Logit
     رنگ          0.09
     walkway      0.09
     рисун        0.09
     ভিত          0.09
                  0.09
     રંગ          0.09
                  0.09
    -black        0.09
     నమ           0.08
    anta          0.08
    Activation Density: 0.003%

    No Known Activations