INDEX
    Explanations

    references to separation or differences between entities or concepts

    Negative Logits
    Token            Logit
    " double"        -0.16
    "loff"           -0.15
    "cken"           -0.14
    "ìŀIJìĿ¸"         -0.14
    "separator"      -0.14
    "double"         -0.14
    "alet"           -0.14
    "reten"          -0.13
    "ses"            -0.13
    "zes"            -0.13
    Positive Logits
    Token              Logit
    " unrelated"       0.21
    " equally"         0.21
    "-Compatible"      0.19
    " incompatible"    0.18
    "/new"             0.17
    " person"          0.16
    "agenda"           0.15
    " entirely"        0.15
    "iator"            0.15
    " completamente"   0.15
    Act Density 0.170%

    No Known Activations