INDEX
    Explanations

    references to social connections and friendship-building

    Negative Logits
    AfterClass  -0.61
     CURIAM     -0.61
    NewReader   -0.58
     oprot      -0.55
     hervorge   -0.54
    ọi          -0.53
    remot       -0.53
    leth        -0.52
    üğ          -0.52
    #           -0.52
    Positive Logits
    expandindo  0.83
     gain       0.69
     gains      0.69
     earn       0.64
     gained     0.64
    earned      0.63
     learns     0.62
    gain        0.61
     Gain       0.61
    learning    0.61
    Activation Density: 0.223%

    No Known Activations