    Negative Logits
    Token         Logit
    " سب"         -0.07
    "але"         -0.07
    " beast"      -0.07
    " defe"       -0.07
    "_qty"        -0.06
                  -0.06
    "EF"          -0.06
    ".FILES"      -0.06
    " isArray"    -0.06
    " Beast"      -0.06
    Positive Logits
    Token         Logit
    " KNOW"       0.09
    " aware"      0.08
    " know"       0.08
    " learns"     0.07
    "Mom"         0.07
    " endorsing"  0.07
    "know"        0.06
    " thinks"     0.06
    ".bn"         0.06
    " trä"        0.06
    Activation Density: 0.005%

    No Known Activations