    Negative Logits
    Token          Logit
    " mushroom"    -0.09
    " gummy"       -0.09
    " Bram"        -0.09
    " slim"        -0.08
    "tmpl"         -0.08
    "sev"          -0.08
    "weg"          -0.08
    "ozy"          -0.08
    "-Bahn"        -0.08
    "pec"          -0.08
    Positive Logits
    Token              Logit
    "整数"             0.08
    " Обычно"          0.08
    " किलो"            0.08
    " monot"           0.08
    "ycles"            0.08
    " представлены"    0.07
    " pren"            0.07
    " radians"         0.07
    " lain"            0.07
    "范围"             0.07
    Activation Density: 0.017%

    No Known Activations