    NEGATIVE LOGITS
    Token            Logit
    'thenReturn'     -0.47
    '"><'            -0.47
    '?<'             -0.46
    'Michelle'       -0.46
    '<'              -0.45
    ' Pic'           -0.45
    ' Designated'    -0.44
    ' <'             -0.43
    ' box'           -0.43
    'Neil'           -0.43
    POSITIVE LOGITS
    Token            Logit
    ' slavery'       2.08
    ' Slavery'       1.88
    'slavery'        1.70
    ' ensla'         0.96
    ' esclavos'      0.94
    ' escla'         0.91
    ' escra'         0.89
    'Sla'            0.89
    ' HasFactory'    0.84
    ' enslaved'      0.73
    Activation Density 0.004%

    No Known Activations