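    NEGATIVE LOGITS
    Token             Value
    "۵"               0.76
    "roon"            0.68
    " bizony"         0.67
    " ۵"              0.65
    "国产"            0.65
    " certain"        0.65
    "つの"            0.64
    "٥"               0.64
    (blank)           0.63
    " particulier"    0.63

    POSITIVE LOGITS
    Token             Value
    "0"               2.00
    (blank)           1.49
    (blank)           1.32
    "𝟬"               1.31
    (blank)           1.24
    (blank)           1.20
    (blank)           1.19
    (blank)           1.11
    "٠"               1.09
    "𝟎"               1.09
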
    Act Density 0.431%

    No Known Activations