INDEX
    Explanations

    mentions of friends and family

    Negative Logits
    Token             Logit
    " Friend"         -0.32
    "Friend"          -0.30
    "friend"          -0.29
    " friend"         -0.28
    " friendship"     -0.27
    " Friendship"     -0.24
    " friendships"    -0.23
    " Friends"        -0.21
    "_friend"         -0.20
    " друж"           -0.20

    Positive Logits
    Token             Logit
    " foes"            0.31
    " enemies"         0.29
    " acquaint"        0.28
    " neighbors"       0.27
    " Enemies"         0.27
    " neighbours"      0.25
    " colleagues"      0.25
    " foe"             0.24
    " associates"      0.24
    " relatives"       0.24
    Act Density 0.025%

    No Known Activations