INDEX
    Explanations

    elements of community involvement and personal authenticity

    Negative Logits
    Token           Logit
    " unlike"       -0.17
    " btw"          -0.15
    "contr"         -0.14
    "ffen"          -0.14
    "¹"             -0.14
    "ì¹ĺ"           -0.13
    " trace"        -0.13
    "itled"         -0.13
    " traces"       -0.13
    "uri"           -0.13
    Positive Logits
    Token           Logit
    " instead"      0.25
    "instead"       0.22
    " Instead"      0.19
    "Instead"       0.19
    "ARING"         0.18
    "æ¸Ī"           0.17
    " naopak"       0.17
    " anonymously"  0.16
    "okus"          0.15
    "jk"            0.14
    Activation Density: 0.454%

    No Known Activations