    Explanations

    proper nouns, specifically names of individuals

    references to individuals, particularly their actions or attributes

    Negative Logits
     marginally    -0.60
    EngineDebug    -0.59
     (?,           -0.58
     ."            -0.57
     ().           -0.55
     moderately    -0.53
     âĶľ           -0.53
     ",            -0.52
     horm          -0.51
    ®,             -0.50
    Positive Logits
    ]              2.59
    ],"            2.52
    ]"             2.49
    ]."            2.41
    ]'             2.34
    ],             2.19
    ']             2.14
    ]-             2.13
    ].             2.08
    ?]             1.89
    Activation Density: 0.126%

    No Known Activations