    Explanations

    Phrases expressing dualities or contrasts

    Concepts that involve duality or being twofold

    Negative Logits
    Token         Logit
    "sburgh"      -0.74
    "ugu"         -0.69
    "uez"         -0.69
    " Volks"      -0.65
    " Kard"       -0.61
    " Century"    -0.61
    "dq"          -0.60
    "uffer"       -0.59
    " Caption"    -0.59
    "tions"       -0.58
    Positive Logits
    Token         Logit
    " sexes"      1.38
    " sides"      1.15
    " halves"     1.08
    " genders"    1.07
    " thirds"     0.72
    " extremes"   0.70
    "imilar"      0.70
    "ocating"     0.68
    "ドラゴン"    0.66
    " animate"    0.64
    Activation Density: 0.060%

    No Known Activations