INDEX
    Explanations

    adverbs indicating degree or intensity

    phrases that express varying degrees of surprise or amazement

    Negative Logits
    "ature"       -0.68
    "UME"         -0.66
    "isher"       -0.65
    "odder"       -0.65
    "izu"         -0.63
    "oris"        -0.61
    " Ans"        -0.60
    "agonists"    -0.60
    "ograph"      -0.59
    "uthor"       -0.59
    Positive Logits
    "HCR"         0.93
    "ls"          0.87
    "soever"      0.83
    " much"       0.81
    "ling"        0.80
    "bill"        0.77
    "beit"        0.77
    " MUCH"       0.77
    "ells"        0.76
    "much"        0.75
    Activation Density: 0.083%

    No Known Activations