INDEX
    Explanations

    references to sentiment around shared interests and likability

    Negative Logits
    " like"      -0.18
    "cape"       -0.15
    "ogy"        -0.15
    "शन"         -0.15
    "ált"        -0.14
    "roe"        -0.14
    "locate"     -0.14
    "ÙĬج"        -0.14
    "dy"         -0.14
    "linux"      -0.14

    Positive Logits
    " minded"    0.39
    "-minded"    0.39
    " Minds"     0.26
    "WISE"       0.26
    "able"       0.25
    " minds"     0.25
    "hood"       0.22
    "ability"    0.21
    "ewise"      0.20
    "inded"      0.20

    Activation Density 0.033%

    No Known Activations