INDEX
    Explanations
    Negative Logits
    𝕟
    1.13
        
    1.09
    1.05
                
    1.02
    1.01
     IBOutlet
    0.99
     Э
    0.98
     покры
    0.98
         
    0.97
    товой
    0.96
    Positive Logits
    al        1.53
    ي         1.43
    y         1.37
    f         1.26
    yuk       1.25
    stile     1.22
    le        1.21
    e         1.21
    esinin    1.19
    mary      1.18
    Act Density 0.001%

    No Known Activations