INDEX
    Explanations

    specific categories of food and animals

    Negative Logits
     revers     -0.15
    349         -0.15
    857         -0.14
    presso      -0.14
     Cha        -0.14
     Benson     -0.14
    nde         -0.14
    509         -0.14
     const      -0.14
    byn         -0.14
    Positive Logits
    ัà¸ģà¸Ĺ      0.15
    æ¿          0.14
    atters      0.14
    esini       0.14
    \helpers    0.14
    alars       0.14
    Neither     0.14
     ngang      0.13
    chwitz      0.13
    ihan        0.13
    Activation Density 0.205%

    No Known Activations