    Negative Logits
     rato         -0.08
     Fax          -0.08
     одна         -0.07
     Ata          -0.07
     ^↵           -0.07
     Ultimate     -0.07
     <--          -0.07
                  -0.07
     entrega      -0.07
    measure       -0.07

    Positive Logits
     Maa          0.09
                  0.08
    材料          0.08
     illusions    0.08
     refer        0.08
                  0.07
    IMP           0.07
     Schwartz     0.07
     Schn         0.07
     Scha         0.07

    Activation Density: 0.004%

    No Known Activations