INDEX
    Explanations

    sentence structures indicating the presence of information or knowledge

    references to changes and what is known or requires action

    Negative Logits
    gart    -0.77
    asers    -0.69
    aser    -0.66
     IDs    -0.65
    chairs    -0.60
     heels    -0.59
     Outbreak    -0.58
    upiter    -0.57
    nels    -0.57
    motion    -0.57
    Positive Logits
     done    0.99
     accomplished    0.94
     learnt    0.88
     transpired    0.87
     learned    0.83
     glean    0.83
     written    0.82
     redacted    0.80
    FINE    0.78
     undone    0.76
    Activation Density: 0.149%

    No Known Activations