INDEX
    Explanations
    Negative Logits
     Niet       -0.08
     rollback   -0.07
     até        -0.07
    ("'"        -0.07
     Luk        -0.06
    mqtt        -0.06
    929         -0.06
    emente      -0.06
    kám         -0.06
     Kant       -0.06
    Positive Logits
    Rib         0.08
     rib        0.08
    ib          0.07
    	ns         0.07
     Rib        0.07
     gb         0.07
    afb         0.06
    ीब          0.06
     nib        0.06
    ubs         0.06
    Activation Density: 0.018%

    No Known Activations