    Explanations

    occurrences of the word "select" and various forms of the word "viving" (such as "living")

    Negative Logits
     Tort       -0.18
    rai         -0.16
    lun         -0.15
    :"-"`↵      -0.14
    epend       -0.14
    alloc       -0.14
     Torres     -0.14
    acas        -0.14
     Kurul      -0.14
    armor       -0.14
    Positive Logits
    hw          0.16
    etros       0.15
     Bret       0.14
    roit        0.14
     retro      0.14
    roat        0.14
    ẽ           0.13
    etrofit     0.13
    oland       0.13
    etro        0.13
    Activation Density 0.007%

    No Known Activations