INDEX
    Explanations

    words related to collective or shared experiences

    Negative Logits
    Token             Logit
    " they"           -0.49
    " she"            -0.46
    " вони"           -0.46
    "they"            -0.41
    "BOTH"            -0.41
    " они"            -0.40
    " onlar"          -0.39
    " horizontally"   -0.38
    " Both"           -0.38
    " ella"           -0.38

    Positive Logits
    Token             Logit
    " its"             1.10
    " how"             0.91
    " their"           0.88
    " její"            0.73
    " cómo"            0.73
    " seus"            0.71
    " bagaimana"       0.70
    " related"         0.68
    " ihrer"           0.68
    " suas"            0.66
    Activation Density 0.631%

    No Known Activations