    Negative Logits
    -0.07
     disob
    -0.06
     pearls
    -0.06
     پرو
    -0.06
     bacon
    -0.06
     Tf
    -0.06
      		
    -0.06
    desired
    -0.06
    ip
    -0.06
    .backends
    -0.06
    Positive Logits
     Quant
    0.12
     quant
    0.11
    Quant
    0.11
     quantitative
    0.10
    quant
    0.09
    _quant
    0.08
     мот
    0.07
    .quant
    0.07
     validated
    0.07
     Trent
    0.07
    Activation Density: 0.007%

    No Known Activations