INDEX
    Explanations

    the word "true" in various contexts

    Negative Logits
    Token       Logit
    lahoma      -0.16
    trap        -0.16
    shed        -0.16
    shire       -0.15
    onse        -0.15
    sburg       -0.15
    ropolis     -0.15
    trib        -0.14
    tron        -0.14
    land        -0.13
    Positive Logits
    Token       Logit
    /false      0.26
    caller      0.24
    -blue       0.24
    st          0.23
    -life       0.19
    izz         0.19
    820         0.17
    sted        0.17
    -bel        0.17
    _false      0.17
    Activation Density: 0.024%

    No Known Activations