    Negative Logits
    Token     Value
    "ok"      0.73
    "。</"    0.56
    "el"      0.55
    "2"       0.51
    "6"       0.49
    "AB"      0.49
    "ak"      0.49
    "ij"      0.48
    "af"      0.48
              0.47
    Positive Logits
    Token          Value
    " moderne"     1.28
    " moderno"     1.16
    " 현대"        1.13
    " modernes"    1.11
    " modernos"    1.07
    " modernen"    1.02
    "現代"         1.01
    " moderna"     0.99
    " modern"      0.98
    " modernas"    0.98
    Activation Density: 0.035%

    No Known Activations