    Explanations

    math reasoning

    Negative Logits
     nutrition      -0.08
     archaeology    -0.07
    מוק             -0.07
    get             -0.07
    .nr             -0.07
     residency      -0.07
    ¡               -0.07
    Duration        -0.07
    ීම              -0.07
    िंक             -0.07
    Positive Logits
     triv       0.12
     trivial    0.12
     Empty      0.10
     kosong     0.10
     zero       0.10
     Zero       0.10
    _ZERO       0.10
     zéro       0.10
     아무       0.10
     صفر        0.10
    Activation Density: 0.220%

    No Known Activations