INDEX
    Negative Logits
    arers           -0.71
    entimes         -0.69
    ographies       -0.67
    urers           -0.67
    boxing          -0.67
    ersive          -0.67
    phasis          -0.66
    uto             -0.66
     nevertheless   -0.65
    lees            -0.65
    Positive Logits
     RTX            0.70
     theirs         0.70
     Omni           0.69
     Hurricane      0.67
     Adventures     0.66
     IEEE           0.65
     Premiere       0.65
     Flip           0.64
     Ubuntu         0.63
     Dungeons       0.63
    Activation Density: 1.523%

    No Known Activations