INDEX
    Explanations

    elements related to mathematical expressions and notation

    Negative Logits
    "kem"       -0.15
    "theon"     -0.14
    "ecial"     -0.13
    "pNet"      -0.13
    "ÙIJر"      -0.13
    "åĤĻ"       -0.13
    "اÙĬات"     -0.13
    "urette"    -0.13
    "assen"     -0.12
    " Orient"   -0.12

    Positive Logits
    "x"      0.41
    " x"     0.41
    "X"      0.37
    "\tx"    0.36
    "_x"     0.32
    "$x"     0.32
    "(x"     0.31
    " X"     0.31
    ".x"     0.31
    "-x"     0.30
    Activation Density: 0.155%

    No Known Activations