INDEX
Explanations
personal pronouns and related verbs
first-person pronouns and phrases expressing personal opinions or statements
Negative Logits
imum          -0.80
é¾įå¥ij士      -0.75
ðĿ            -0.74
quartered     -0.73
unknown       -0.71
unin          -0.71
assembly      -0.71
-+-+          -0.70
tnc           -0.68
phans         -0.68
Positive Logits
lied          0.76
exagger       0.75
mention       0.74
kidding       0.73
forgot        0.72
exaggeration  0.70
typo          0.68
guessed       0.68
cheated       0.68
cried         0.67
Activations Density 0.153%
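
The density figure above can be read as the share of token positions on which the feature fires. The sketch below is only an illustration under that assumption: the function name `activation_density`, the zero threshold, and the sample data are hypothetical and are not taken from the dashboard or its source.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: a feature counts as "active" on a token when its activation
    is strictly greater than the threshold (0.0 by default).
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: the feature fires on 153 of 100,000 token positions.
sample = [1.0] * 153 + [0.0] * (100_000 - 153)
density = activation_density(sample)
print(f"{density:.3%}")
```

With these made-up counts the result formats to the same percentage shown above; the real token counts behind the dashboard are not stated in the source.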