    NEGATIVE LOGITS
    Token         Logit
    " fac"        -0.08
    " Over"       -0.07
    "Over"        -0.07
    " lower"      -0.07
    " Ow"         -0.07
    "omat"        -0.07
    " Trigger"    -0.07
    "Trigger"     -0.07
    " prov"       -0.07
    " rotate"     -0.06
    POSITIVE LOGITS
    Token         Logit
    " best"       0.13
    "best"        0.13
    "Best"        0.12
    " Best"       0.11
    " BEST"       0.10
    ".best"       0.09
    "BEST"        0.09
    "_best"       0.09
    "reatest"     0.07
    "(best"       0.07
    Activation Density: 0.043%

    No Known Activations