INDEX
    Explanations

    references to the concept of "word" in various contexts

    Negative Logits
     myſelf          -1.07
    %]               -1.01
     pleaſure        -0.98
    ."));            -0.98
     himſelf         -0.97
    "]];             -0.96
     photolibrary    -0.96
     Monfieur        -0.96
    "]]              -0.95
     Majefty         -0.92
    Positive Logits
     words           1.76
     Words           1.64
     word            1.63
     Word            1.61
    Word             1.54
     WORD            1.53
     WORDS           1.47
    Words            1.46
    word             1.43
    words            1.39
    Act Density 0.041%

    No Known Activations