INDEX
    Explanations

    markers of structured data or notation, such as mathematical symbols or references

    Negative Logits
    Token        Logit
    𝐮            -0.80
     ruh         -0.78
    𝐥            -0.75
    ítmény       -0.75
    redor        -0.73
     riuscito    -0.73
    pola         -0.73
    bolt         -0.72
    trib         -0.71
     بيها        -0.70
    Positive Logits
    Token        Logit
    $,           1.20
    }}$,         1.04
    }$,          1.03
    ]--;         0.99
    \}$,         0.98
    )}$,         0.98
    $),          0.97
    )$,          0.92
    $).          0.91
    ))$.         0.90
    Activation Density: 0.388%

    No Known Activations