INDEX
    NEGATIVE LOGITS
    Token          Logit
    "<bos>"        -1.63
    "なかった"      -1.43
                   -1.39
    " blauw"       -1.34
    " silicona"    -1.34
    "AILED"        -1.34
    "复制代码"      -1.33
    " although"    -1.31
    " Masih"       -1.31
    "amaño"        -1.30
    POSITIVE LOGITS
    Token          Logit
    " to"          2.47
    " the"         1.95
    "0"            1.64
    "The"          1.57
    " đến"         1.44
    " Therefore"   1.39
    " október"     1.38
    "}$"           1.34
    "什麼"          1.34
    " The"         1.32
    Activation Density: 0.039%

    No Known Activations