INDEX
    Explanations

    references to smiling or enjoyment

    Negative Logits
    Token       Logit
    "amarin"    -0.17
    "anager"    -0.15
    "onn"       -0.15
    "emer"      -0.14
    "æ´"        -0.14
    ".yy"       -0.14
    "asje"      -0.14
    "zcze"      -0.14
    "inator"    -0.14
    "inated"    -0.14
    Positive Logits
    Token       Logit
    " Sm"       0.30
    " sm"       0.29
    "aller"     0.26
    ".Sm"       0.23
    "(sm"       0.23
    " smo"      0.22
    "/sm"       0.21
    ".SM"       0.21
    "arth"      0.21
    ".sm"       0.21
    Act Density 0.012%

    No Known Activations