INDEX
    Explanations

    conjunctions that indicate contrast or opposition

    Negative Logits
    erno  -0.15
     nicht  -0.15
    essen  -0.14
     Bold  -0.14
    esson  -0.14
     नह  -0.14
    adiens  -0.14
    erset  -0.14
     не  -0.14
     không  -0.14
    Positive Logits
     indeed  0.20
     rather  0.19
    лÑİ  0.16
    htub  0.16
    legg  0.16
    Rather  0.15
     Rather  0.15
    ÃĹ↵↵  0.15
    ingleton  0.14
     mo  0.14
    Activation Density 0.050%

    No Known Activations