INDEX
    Explanations

    words that suggest binaries or oppositional concepts

    Negative Logits
    imb             -0.15
    aml             -0.15
    arine           -0.15
    ond             -0.15
    uel             -0.15
    late            -0.15
     Im             -0.14
    urma            -0.14
    ylum            -0.14
    eer             -0.14

    Positive Logits
    aku             0.17
    å¢              0.16
    zier            0.14
    afone           0.13
    awan            0.13
    757             0.13
     RequestMethod  0.13
     effectively    0.13
    ê               0.13
    ाà¤ı             0.13

    Activation Density: 0.039%

    No Known Activations