INDEX
    Explanations

    instances of comparative or contrasting phrases

    Negative Logits
    /stdc          -0.19
    avit           -0.16
    rts            -0.14
    utral          -0.14
    .fil           -0.14
    QUIRE          -0.14
    579            -0.13
    ãģ£ãģ¦ãĤĤ      -0.13
    lek            -0.13
    .ascii         -0.13
    Positive Logits
    enco           0.15
    Ģìŀ¥           0.14
    woord          0.14
    OrElse         0.14
    kinson         0.13
    EP             0.13
     indiv         0.13
     Russo         0.13
     Cart          0.13
    ãĥŃãĥ³         0.13
    Activation Density: 0.109%

    No Known Activations