INDEX
Explanations
statements indicating proof or verification of a claim
the phrase "that" in various contexts
Negative Logits
gur        -0.66
Guard      -0.59
erves      -0.58
Topics     -0.58
isers      -0.56
Wide       -0.56
MH         -0.55
è»         -0.55
erve       -0.54
estones    -0.54
Positive Logits
contradicts  0.77
they         0.75
cher         0.73
fateful      0.65
pesky        0.64
soever       0.64
we           0.64
somebody     0.64
someone      0.63
proves       0.63
Activations Density: 0.286%