    Explanations

    expressions of physical affection, particularly hugging

    Negative Logits
    Token          Logit
    " kesi"        -0.61
    " jaya"        -0.60
    " saha"        -0.60
    " istan"       -0.58
    " maksi"       -0.56
    " vider"       -0.56
    " felipe"      -0.56
    " rodrigo"     -0.54
    " alberto"     -0.54
    " kasa"        -0.54
    Positive Logits
    Token          Logit
    " hug"         0.66
    "<bos>"        0.63
    " hugged"      0.62
    " hugs"        0.61
    " hugging"     0.57
    "queeze"       0.55
    " comforting"  0.55
    "hug"          0.52
    " embrace"     0.51
    "Hug"          0.50
    Activation Density: 0.193%

    No Known Activations