INDEX
    Explanations

    references to personal relationships and interactions

    Negative Logits
    Token                      Logit
    (token missing in source)  -0.53
    "WithTag"                  -0.53
    "kof"                      -0.51
    "IndentedString"           -0.50
    " ordering"                -0.47
    " appraisal"               -0.47
    " bulunabilir"             -0.47
    " doty"                    -0.46
    " intercession"            -0.46
    " pomo"                    -0.45

    Positive Logits
    Token                      Logit
    " never"                   1.06
    " can"                     0.97
    " always"                  0.97
    " have"                    0.97
    " would"                   0.96
    " had"                     0.95
    " didn"                    0.95
    " still"                   0.93
    " don"                     0.90
    " will"                    0.87
    Act Density 1.120%
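
    The page does not define how the density figure is computed. One common reading is the fraction of sampled token positions on which the feature's activation is nonzero. The sketch below illustrates that reading only. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and are not taken from the source.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means (number of active positions) / (total positions).
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 1000 token positions, 11 of them with a nonzero activation.
sample = [0.0] * 989 + [0.7] * 11
density = activation_density(sample)
print(f"Act Density {density:.3%}")
```

    Under this reading, a density near 1% would mean the feature fires on roughly one token position in a hundred of the sampled text.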

    No Known Activations