INDEX
    Explanations

    instances of the word "opposed."

    Negative Logits
    ston      -0.07
    ish       -0.07
    aze       -0.06
    asil      -0.06
    oba       -0.06
    istics    -0.06
     minim    -0.06
    isha      -0.06
    pra       -0.06
    olt       -0.06
    Positive Logits
    piler     0.08
    avad      0.08
    ìĿ´íĦ°    0.07
    sing      0.07
    æĸ¼       0.07
    avatel    0.07
    grese     0.07
    ħn        0.07
    renom     0.07
    s         0.07
    Act Density 0.003%

    No Known Activations