    NEGATIVE LOGITS
    " inkl"      -0.07
    "-parse"     -0.07
    "aims"       -0.06
    " Copy"      -0.06
    "274"        -0.06
    (blank)      -0.06
    "cosystem"   -0.06
    "_stamp"     -0.06
    "471"        -0.06
    (blank)      -0.06
    POSITIVE LOGITS
    " relax"        0.14
    " relaxed"      0.14
    " relaxing"     0.12
    " relaxation"   0.11
    " Relax"        0.10
    "lux"           0.07
    " lax"          0.07
    "rix"           0.07
    "าก"            0.06
    " boring"       0.06
    Act Density 0.008%

    No Known Activations