INDEX
    Explanations
    Negative Logits
     Fellow         -0.08
     therap         -0.07
     Nur            -0.07
     aperture       -0.06
     Drum           -0.06
     divergence     -0.06
     Drill          -0.06
     sampled        -0.06
    _EXPRESSION     -0.06
     lup            -0.06
    Positive Logits
    Static          0.08
    atic            0.08
    Si              0.08
    asmine          0.07
     static         0.07
    STATIC          0.07
    ática           0.07
    sic             0.07
    امت             0.07
     Static         0.07
    Activation Density: 0.014%

    No Known Activations