    Negative Logits
     barn           -0.07
     serge          -0.07
     sofas          -0.07
    .rollback       -0.07
     WOM            -0.07
     burglary       -0.07
    قاد             -0.06
     Builders       -0.06
     الأجنب         -0.06
    ="<?            -0.06
    Positive Logits
    ạn              0.07
     disagree       0.07
    -fin            0.07
    --;             0.07
    kos             0.06
    ptrdiff         0.06
                    0.06
                    0.06
                    0.06
     Megan          0.06
    Act Density 0.012%

    No Known Activations