    Negative Logits
    Token            Logit
    "-word"          -0.08
    " dvěma"         -0.07
    " Beyond"        -0.07
    "(isolate"       -0.07
    "ingular"        -0.06
    "bursement"      -0.06
    (not shown)      -0.06
    "(ls"            -0.06
    "(col"           -0.06
    (not shown)      -0.06
    Positive Logits
    Token            Logit
    "ograf"          0.08
    " трансп"        0.07
    "/Resources"     0.06
    (not shown)      0.06
    " ammon"         0.06
    " Але"           0.06
    " CHANNEL"       0.06
    " محاس"          0.06
    "-era"           0.06
    " حمل"           0.06
    Activation Density: 0.099%
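
    The density figure above is not defined on this page. A minimal sketch of one common reading, assuming it means the fraction of token positions at which the feature's activation is above zero; the function name, the threshold, and the sample data are hypothetical and are not taken from the dashboard:

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature fires at all. The dashboard may define it differently.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 1 of 1000 token positions.
sample = [0.0] * 999 + [2.5]
print(f"{activation_density(sample):.3%}")  # prints 0.100%
```

    Under that reading, a density of 0.099% would correspond to the feature firing on roughly one token in a thousand.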

    No Known Activations