INDEX
    Explanations

    instances of the word "t."

    Negative Logits
    cination    -0.55
     persons    -0.50
    այ          -0.49
    м           -0.49
     "          -0.49
     '          -0.46
                -0.46
     Persons    -0.46
    ONLY        -0.46
    only        -0.45
    Positive Logits
    ’)          1.42
    ’.          1.35
    ’).         1.34
    ’,          1.33
    ’:          1.25
    ’?          1.24
    ’”          1.22
    ’;          1.21
    )’          1.19
    ”),         1.18
    Act Density 0.092%

    No Known Activations