    NEGATIVE LOGITS
    Token           Logit
    "ounter"        -0.75
    "icably"        -0.74
    " allowable"    -0.73
    "ities"         -0.69
    "etheless"      -0.69
    " toler"        -0.68
    " proble"       -0.67
    "ITIES"         -0.67
    "ictions"       -0.67
    "ancies"        -0.67
    POSITIVE LOGITS
    Token           Logit
    "love"          1.02
    "Works"         1.01
    " Squad"        0.98
    "Maker"         0.98
    " Breaker"      0.97
    " Runner"       0.96
    "Squ"           0.96
    "breaker"       0.96
    "Point"         0.95
    " Girl"         0.94
    Activation Density: 1.713%

    No Known Activations