INDEX
Explanations
occurrences of the word "We" and its variants in the text
Negative Logits
عÛĮ       -0.16
figuring  -0.15
precated  -0.14
WARDED    -0.14
Lilly     -0.14
ampa      -0.14
aghetti   -0.14
cept      -0.14
vier      -0.14
opia      -0.13
Positive Logits
note      0.19
caution   0.17
term      0.17
wish      0.16
remark    0.16
defer     0.16
adher     0.15
refer     0.15
stress    0.15
784       0.15
Activations Density: 0.078%