INDEX
    Explanations

    references to specific variables or symbols in a mathematical or scientific context

    Negative Logits
    "Aya"       -0.79
    "っきり"    -0.70
    "ameter"    -0.65
    " reç"      -0.64
    "$​"         -0.64
    "ㅤㅤ"      -0.62
    (blank)     -0.62
    " ló"       -0.61
    "••••"      -0.61
    " Aya"      -0.61
    Positive Logits
    " Z"        1.10
    " Zinn"     1.04
    " z"        1.00
    "Z"         0.99
    " Zent"     0.99
    " ZR"       0.98
    " Zo"       0.98
    " Zem"      0.96
    " zó"       0.95
    " ZO"       0.95
    Act Density 0.325%

    No Known Activations