INDEX
    Explanations
    NEGATIVE LOGITS
    Token           Logit
    " Brett"        -0.09
    " rejected"     -0.07
    "rejected"      -0.07
    "LER"           -0.07
    " rejection"    -0.07
    " White"        -0.07
    " dismissal"    -0.07
    "Brandon"       -0.07
    " honor"        -0.07
    "zion"          -0.07
    POSITIVE LOGITS
    Token           Logit
    ".."            0.12
    ".."            0.12
    "..↵"           0.09
    "..↵↵"          0.08
                    0.08
    "..↵"           0.08
    "../"           0.07
    ").."           0.07
    "..↵↵"          0.07
    " Pipe"         0.07
    Activation Density: 0.017%

    No Known Activations