INDEX
    Explanations

    discourse markers or expressions that indicate contrast or transitions in thought

    Negative Logits
    ans       -0.15
    _joint    -0.14
    rea       -0.14
    ansa      -0.14
    otec      -0.14
    ycler     -0.14
    lix       -0.14
    quals     -0.14
    fel       -0.14
    vey       -0.13

    Positive Logits
    cala       0.15
    okane      0.15
    codegen    0.15
    tml        0.15
    bum        0.14
    enas       0.14
    ieux       0.14
     rych      0.14
    éIJĺ       0.14
    aces       0.14
    Act Density 0.000%

    No Known Activations