    Explanations

    expressions of emotional connections and personal relationships

    Negative Logits
    Token       Logit
    "utas"      -0.16
    " pivot"    -0.14
    " whip"     -0.14
    "цем"       -0.14
    "ilon"      -0.14
    "ьми"       -0.13
    "lobs"      -0.13
    "etter"     -0.13
    "ueva"      -0.13
    "enta"      -0.13
    Positive Logits
    Token        Logit
    " showing"   0.47
    " show"      0.44
    " showed"    0.43
    " shows"     0.42
    " Showing"   0.40
    " SHOW"      0.38
    "show"       0.38
    "Showing"    0.37
    "-show"      0.33
    " Show"      0.33
    Activation Density: 0.133%

    No Known Activations