INDEX
Explanations
instances of the word "that" used to introduce an explanation or a defining clause
the recurring phrase "the fact that," suggesting the feature responds to statements that present assertions or conclusions
Negative Logits
OLOG    -0.68
mes     -0.68
ーン    -0.67
YC      -0.65
OHN     -0.64
Ble     -0.64
rup     -0.62
HO      -0.61
robe    -0.61
INAL    -0.61
Positive Logits
they           0.76
pesky          0.69
contradicts    0.68
hindsight      0.67
chery          0.65
soever         0.65
zsche          0.65
accompanies    0.63
happened       0.63
we             0.62
Activations Density: 0.086%