    Explanations

    phrases related to ability or incapability

    negations and assertions of ability

    Negative Logits
    Token             Logit
    "Lens"            -0.64
    " Rutherford"     -0.62
    " enthusi"        -0.62
    " Brent"          -0.60
    "IDES"            -0.57
    " millenn"        -0.56
    "ELD"             -0.56
    " prompts"        -0.55
    " conspicuous"    -0.55
    " Mant"           -0.55
    Positive Logits
    Token             Logit
    "'t"              2.33
    "NOT"             1.32
    "adian"           1.17
    " afford"         1.14
    "na"              1.06
    " hardly"         1.05
    "´"               1.00
    "berra"           0.99
    "ny"              0.99
    " handle"         0.96
    Act Density 0.143%

    No Known Activations