INDEX
    Explanations

    words related to performing well or achieving success

    phrases related to making good use or having fun

    Negative Logits
    hov          -0.63
     cracked     -0.60
    isma         -0.57
    anwhile      -0.57
     privately   -0.56
     nailed      -0.56
    hens         -0.55
     approved    -0.55
    ridor        -0.54
     Niet        -0.54
    Positive Logits
    roads        0.82
    ends         0.80
    sense        0.77
    fell         0.74
     mockery     0.72
     noises      0.69
     landfall    0.68
    URE          0.67
    nell         0.67
     sacrifices  0.67
    Act Density 0.094%

    No Known Activations