    Explanations

    phrases that indicate comparisons or analogies

    Negative Logits
    Token       Logit
    "#$"        -0.66
    "Posts"     -0.66
    "aldo"      -0.65
    "rous"      -0.65
    "ãģł"       -0.64
    "OUS"       -0.63
    "itton"     -0.62
    "regon"     -0.59
    "ouk"       -0.59
    "OUR"       -0.59
    Positive Logits
    Token           Logit
    "pired"         1.21
    "pires"         1.20
    " portrayed"    0.97
    " depicted"     0.95
    "ociated"       0.95
    "pects"         0.93
    " well"         0.93
    "phy"           0.91
    "semb"          0.89
    "pire"          0.88
    Act Density 0.084%

    No Known Activations