INDEX
    Explanations

    contractions of words with specific characters such as 'n't'

    instances of the word "wouldn't" in various contexts

    Negative Logits
    Token       Logit
    æ³          -0.68
    ARM         -0.68
    story       -0.66
    ULT         -0.63
    PI          -0.62
     Case       -0.62
    PU          -0.61
    ocal        -0.60
    Adv         -0.60
    agency      -0.60
    Positive Logits
    Token       Logit
    't          1.09
    ģĸ          0.82
     never      0.79
    terness     0.78
    atically    0.76
    geon        0.76
    ¹           0.74
     surely     0.73
    ¨           0.73
    ±           0.73
    Act Density 0.009%

    No Known Activations