INDEX
    Explanations

    phrases emphasizing high recognition or cultural significance

    Negative Logits
    Token        Logit
    "isay"       -0.15
    "trys"       -0.15
    "cola"       -0.15
    "apter"      -0.14
    "oppins"     -0.14
    "ults"       -0.14
    "iferay"     -0.14
    "oningen"    -0.14
    "odesk"      -0.14
    "okit"       -0.14
    Positive Logits
    Token             Logit
    " talked"         0.28
    "-talk"           0.24
    " loved"          0.22
    " anticipated"    0.21
    " visited"        0.20
    " discussed"      0.20
    " followed"       0.20
    " well"           0.19
    " buzz"           0.19
    "-ce"             0.19
    Activation Density: 0.054%

    No Known Activations