    Negative Logits
    Token          Value
    (blank)        1.04
    í              0.91
    ong            0.80
    et             0.77
    <0xBF>         0.75
    (not shown)    0.75
    <0xAF>         0.73
    ai             0.72
    ens            0.71
    ва             0.70
    Positive Logits
    Token          Value
    Ско            1.06
    К              1.02
    (not shown)    0.89
    При            0.85
    Он             0.84
    У              0.84
    Ра             0.84
    Су             0.84
    Α              0.82
    То             0.81
    Act Density 0.006%

    No Known Activations