INDEX
    Explanations

    phrases or clauses introducing contrasting or explanatory information

    repetitions of the word "that"

    Negative Logits
    "acca"      -0.80
    "oras"      -0.69
    "arine"     -0.67
    "orse"      -0.67
    "kay"       -0.66
    "aro"       -0.66
    "WT"        -0.66
    "ña"        -0.66
    " Forty"    -0.65
    "tnc"       -0.65
    Positive Logits
    " nonetheless"     1.49
    " nevertheless"    1.40
    " alas"            0.98
    "etheless"         0.95
    " still"           0.92
    " beware"          0.86
    " persisted"       0.84
    " fortunately"     0.78
    " persists"        0.78
    " never"           0.77
    Act Density 0.370%

    No Known Activations