INDEX
    Explanations

    Japanese characters with specific strokes and proportions

    specific non-English characters or symbols

    Negative Logits
    Token        Logit
    "ESS"        -0.71
    " Beir"      -0.71
    " tort"      -0.66
    " JPM"       -0.64
    " hypers"    -0.63
    " broker"    -0.63
    " Claus"     -0.61
    "arella"     -0.60
    "afort"      -0.59
    "ODY"        -0.58
    Positive Logits
    Token        Logit
    "nen"        1.08
    "Åį"         1.07
    "Å«"         1.06
    "··"         0.97
    "nin"        0.96
    "su"         0.92
    "shi"        0.91
    "Ê"          0.91
    "ãĥ³ãĤ¸"     0.91
    "Äģ"         0.89
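
    Several positive-logit tokens (for example "Åį" and "ãĥ³ãĤ¸") look like the display form of a byte-level BPE vocabulary rather than the text they stand for. The sketch below assumes, without confirmation from this page, that the model uses a GPT-2-style byte-to-character table. It rebuilds that table and maps a display token back to UTF-8 text. The names `bytes_to_unicode`, `CHAR_TO_BYTE`, and `decode_display_token` are illustrative and not part of any dashboard API.

```python
def bytes_to_unicode():
    """Rebuild the GPT-2-style table mapping each byte (0-255) to a printable character.

    Assumption: the tokens on this page were rendered with this table.
    """
    # Bytes that are already printable map to themselves.
    bs = list(range(0x21, 0x7F)) + list(range(0xA1, 0xAD)) + list(range(0xAE, 0x100))
    cs = bs[:]
    n = 0
    # Every remaining byte is assigned a code point starting at U+0100.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return {b: chr(c) for b, c in zip(bs, cs)}


# Inverse table: display character -> original byte value.
CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_display_token(token: str) -> str:
    """Map a display-form token back to text.

    Characters outside the table (such as a literal space) are kept as-is.
    Incomplete UTF-8 sequences decode to U+FFFD, the replacement character.
    """
    raw = bytearray()
    for ch in token:
        if ch in CHAR_TO_BYTE:
            raw.append(CHAR_TO_BYTE[ch])
        else:
            raw.extend(ch.encode("utf-8"))
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["nen", "Åį", "Å«", "ãĥ³ãĤ¸", "Äģ", "Ê"]:
        print(repr(tok), "->", repr(decode_display_token(tok)))
```

    Under that assumption, "Åį", "Å«", and "Äģ" decode to the macron vowels ō, ū, and ā, and "ãĥ³ãĤ¸" decodes to the katakana ンジ, which fits the Japanese-related explanations listed above. A token such as "Ê" is a lone lead byte, so it decodes to the replacement character rather than to a complete character.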
    Act Density 0.008%

    No Known Activations