INDEX
    Explanations

    connections and references to social relationships, particularly involving friends and family

    Negative Logits
    Token             Logit
    " friend"         -0.40
    " Friend"         -0.40
    "friend"          -0.38
    "Friend"          -0.36
    " friendship"     -0.34
    " friendships"    -0.32
    " Friends"        -0.31
    " friends"        -0.30
    "friends"         -0.29
    " Friendship"     -0.28

    Positive Logits
    Token             Logit
    " foes"            0.28
    " aqu"             0.25
    " enemies"         0.25
    " foe"             0.23
    " Enemies"         0.23
    " Aqu"             0.21
    "col"              0.21
    " neighbors"       0.20
    " enemy"           0.20
    " acqu"            0.20

    Act Density 0.027%

    No Known Activations