INDEX
    Explanations

    negative contractions and phrases expressing disbelief or rejection

    Negative Logits
     realization     -0.20
     realise         -0.18
     realizes        -0.16
     realizing       -0.16
    irk              -0.15
     realize         -0.15
    hod              -0.15
    ureka            -0.14
    umer             -0.14
    osen             -0.13
    Positive Logits
     recall          0.26
     think           0.25
     recalling       0.21
    think            0.21
     remember        0.20
    recall           0.20
     fault           0.20
     recalled        0.20
     recalls         0.19
     myself          0.19
    Act Density 0.059%

    No Known Activations