    Explanations

    statements of identity and self-description

    Negative Logits
    ok            -0.17
    759           -0.15
    åIJ§           -0.15
     böylece      -0.15
     therefore    -0.15
    812           -0.15
    .Dao          -0.14
     however      -0.14
    acha          -0.14
     Either       -0.14
    Positive Logits
     only         0.20
    åıªæĺ¯        0.18
    only          0.17
     далеко       0.17
    Only          0.16
    deaux         0.15
     âīł          0.15
     Only         0.15
    /Peak         0.15
    ’ÑıÑĤ         0.15
    Act Density 0.234%

    No Known Activations