INDEX
    Explanations

    phrases that emphasize similarity or consistency

    instances of the phrase "the same" or variations of it related to similarity

    Negative Logits
    Token         Logit
    " amongst"    -0.71
    "wana"        -0.69
    "oa"          -0.68
    " among"      -0.67
    " Leilan"     -0.58
    "gew"         -0.58
    "abin"        -0.57
    "mund"        -0.56
    "Soul"        -0.56
    "isma"        -0.56
    Positive Logits
    Token           Logit
    " same"         2.73
    "same"          2.45
    " Same"         2.03
    "Same"          1.87
    " opposite"     1.70
    " exact"        1.55
    " identical"    1.25
    " inverse"      1.09
    " reverse"      1.05
    "ses"           1.04
    Activation Density: 0.301%

    No Known Activations