INDEX
    Explanations

    phrases emphasizing frequency or quantity, particularly the word "nearly"

    Negative Logits
    " Chá»ī"          -0.15
    "eted"            -0.15
    " stripslashes"   -0.14
    "earing"          -0.14
    "xAA"             -0.14
    "istrat"          -0.14
    " Animalia"       -0.14
    "ãn"              -0.14
    "ilog"            -0.14
    "folio"           -0.13
    Positive Logits
    " exclusively"    0.21
    " identical"      0.20
    " impossible"     0.20
    " dozen"          0.20
    "100"             0.18
    " always"         0.18
    " twice"          0.18
    " everything"     0.17
    " constant"       0.16
    "Impossible"      0.16
    Activation Density: 0.028%

    No Known Activations