INDEX
    Negative Logits
    Token            Logit
    " flamingo"      0.42
    "iosa"           0.39
    "flamingo"       0.39
    " argentinos"    0.39
    "鹿"             0.38
    "Deer"           0.38
    " جاي"           0.37
    "iciones"        0.37
    "originally"     0.37
    "NAMESPACE"      0.36
    Positive Logits
    Token            Logit
    [token missing]  0.42
    "启发"           0.40
    " mics"          0.39
    " Michal"        0.39
    " prelude"       0.39
    " mcc"           0.39
    " Julia"         0.38
    " Michał"        0.38
    "<0x11>"         0.37
    " bh"            0.37
    Act Density 0.000%

    No Known Activations