    Explanations

    occurrences of the word "break" in various forms

    Negative Logits
    "irth"     -0.20
    "idi"      -0.16
    "ensen"    -0.15
    "amus"     -0.15
    "id"       -0.15
    "raz"      -0.14
    "idl"      -0.14
    "Sdk"      -0.14
    "IDI"      -0.14
    "uz"       -0.14
    Positive Logits
    "away"         0.25
    " ranks"       0.25
    "neck"         0.24
    " barriers"    0.23
    " records"     0.22
    " apart"       0.21
    " down"        0.21
    " away"        0.20
    "-news"        0.20
    " barrier"     0.20
    Act Density 0.016%

    No Known Activations