    Explanations

    social circles or networks

    Negative Logits
    Token           Logit
    " comrades"     -0.12
    " teammate"     -0.11
    " Neighbor"     -0.11
    " fellow"       -0.10
    " neighbor"     -0.10
    " colleague"    -0.10
    " glo"          -0.09
    "omen"          -0.09
    " bil"          -0.09
    " neighbour"    -0.09
    Positive Logits
    Token           Logit
    " circle"       0.49
    " circles"      0.42
    "circle"        0.37
    " Circle"       0.37
    "圈"            0.35
    "Circle"        0.34
    " networks"     0.34
    "-circle"       0.33
    " network"      0.33
    "_circle"       0.28
    Activation Density: 0.109%

    No Known Activations