INDEX
    Negative Logits
    Token (quoted)    Logit
    ' antaranya'      -0.71
    'antaranya'       -0.65
    ' réguli'         -0.64
    ' throughout'     -0.63
    ' numériques'     -0.60
    ' supérieurs'     -0.60
    ' tiroirs'        -0.59
    ' obicei'         -0.58
    ' humaines'       -0.58
    ' during'         -0.57
    Positive Logits
    Token (quoted)    Logit
    ' the'            1.02
    ' a'              1.00
    ' an'             0.81
    'phazard'         0.71
    ']--;'            0.70
    ' his'            0.69
    '="{{$'           0.65
    ' any'            0.63
    '}}^{('           0.63
    ' their'          0.62
    Activation Density: 0.122%

    No Known Activations