    NEGATIVE LOGITS
    "_FUNCTION"      -0.07
    " resorts"       -0.07
                     -0.07
                     -0.07
                     -0.07
    " qualified"     -0.07
    " hol"           -0.07
    " holistic"      -0.07
    " horsepower"    -0.07
    " balcon"        -0.07
    POSITIVE LOGITS
    " goodbye"       0.12
    "_exit"          0.11
    " Interrupted"   0.11
    " Goodbye"       0.11
    " quitting"      0.11
    " farewell"      0.11
    "\texit"         0.11
    " abrupt"        0.11
    "Exit"           0.10
    " interrupted"   0.10
    Activation Density: 0.004%

    No Known Activations