INDEX
    Explanations

    conjunctions and phrases that express contrast or contradiction

    Negative Logits
    Touches     -0.16
    elmet       -0.16
     invent     -0.15
    inesis      -0.15
    avanaugh    -0.15
    uzu         -0.14
    ibold       -0.14
    дÑĢом       -0.14
    леÑĩ        -0.14
    çī          -0.14
    Positive Logits
     soluble    0.14
    ابÙĦ        0.14
    vil         0.14
     partic     0.14
    miss        0.14
     Rae        0.14
     Skywalker  0.13
    asher       0.13
    mes         0.13
    ÄĽn         0.13
    Act Density 0.214%

    No Known Activations