INDEX
    Explanations

    instances of the word "was"

    Negative Logits
    Token      Logit
    "acers"    -0.20
    "eec"      -0.16
    "antly"    -0.15
    "olson"    -0.15
    " flap"    -0.15
    "eus"      -0.15
    " Lod"     -0.15
    "sx"       -0.14
    "acias"    -0.14
    "bart"     -0.14
    Positive Logits
    Token      Logit
    "abi"      0.27
    "illa"     0.22
    "abis"     0.22
    "htub"     0.22
    "atch"     0.20
    "ps"       0.19
    "ILLA"     0.18
    "ABI"      0.18
    "abe"      0.17
    " ist"     0.17
    Activation Density: 0.043%

    No Known Activations