    NEGATIVE LOGITS
    �재              -0.07
     Spider         -0.07
    (policy         -0.07
     Phật           -0.06
     Yas            -0.06
     شخصية          -0.06
     Lei            -0.06
    _checksum       -0.06
    .to             -0.06
    ouch            -0.06
    POSITIVE LOGITS
     EVENTS         0.06
    ######          0.06
     ]↵             0.06
     unanimously    0.06
    erse            0.06
    454             0.06
    ")]↵↵           0.06
    The             0.06
     congreg        0.06
     %↵             0.06
    Activation Density: 0.006%

    No Known Activations