INDEX
    Explanations

    phrases related to personal experience and opinion.

    Negative Logits
    Token           Logit
    " Nora"         -0.56
    " Uriel"        -0.53
    " Dickinson"    -0.51
    " DeVos"        -0.51
    " goodbye"      -0.51
    " abdom"        -0.50
    " Gale"         -0.50
    "heres"         -0.49
    "edient"        -0.49
    " Mats"         -0.47

    Positive Logits
    Token           Logit
    "'m"             1.01
    "'ve"            0.88
    " am"            0.77
    "RL"             0.74
    "pec"            0.73
    " suppose"       0.73
    " wish"          0.72
    " myself"        0.72
    "UC"             0.70
    "stad"           0.70
    Activation Density: 11.214%

    No Known Activations