INDEX
    Explanations

    phrases related to interpersonal interactions and etiquette

    words before "table" or "answer"

    high-frequency function words, especially the articles, prepositions, and possessive determiners that frame noun phrases

    Negative Logits
    Token                Logit
    "verifyException"    -0.45
    "CppCodeGen"         -0.43
    "pi"                 -0.38
    "J"                  -0.38
    " Seitz"             -0.37
    "K"                  -0.36
    "Rüyada"             -0.36
    " Segal"             -0.35
    "N"                  -0.35
    " P"                 -0.34
    Positive Logits
    Token                Logit
    " bezeichneter"      0.54
    " تضيفلها"           0.54
    "duled"              0.52
    "WriteTagHelper"     0.51
    "NUMX"               0.50
    "Ӕ"                  0.50
    "ſchaft"             0.49
    "ConstraintMaker"    0.49
    "alakip"             0.48
    " syst"              0.47
    Activation Density 0.029%
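
As a rough illustration of the density figure above, the sketch below assumes it means the fraction of sampled tokens on which the feature's activation is above zero. The page itself does not define the metric, so that reading is an assumption, and the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical. Under that reading, 0.029% corresponds to about 29 active tokens per 100,000.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of tokens on which the feature fires
    at all. The dashboard does not state its exact definition.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Made-up sample: 29 active tokens out of 100,000.
sample = [0.0] * 99_971 + [1.3] * 29
density = activation_density(sample)
print(f"{density:.3%}")
```

A feature this sparse fires on very few tokens, which fits with the page listing no known activations.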

    No Known Activations