INDEX
    Explanations
    Negative Logits
    Token           Logit
    "ined"          -0.09
    " raging"       -0.08
    " rage"         -0.08
    " excuses"      -0.08
    " Korea"        -0.08
    " Beast"        -0.08
    "acios"         -0.08
    "planation"     -0.08
    " abstra"       -0.07
    "ारित"          -0.07
    Positive Logits
    Token           Logit
    "_CAPTURE"      0.09
    "Captured"      0.08
    "-mid"          0.08
    " collector"    0.08
    " framebuffer"  0.07
    " dobl"         0.07
    " GENER"        0.07
                    0.07
    "George"        0.07
    " ogl"          0.07
    Activation Density: 0.004%

    No Known Activations