Explanations
phrases starting with "the fact that"
the phrase "the fact that."
Negative Logits
aciously     -0.66
rup          -0.66
mes          -0.63
osing        -0.62
idelines     -0.61
ron          -0.61
boards       -0.61
gur          -0.60
OLOG         -0.60
thro         -0.60
Positive Logits
happened     0.75
hindsight    0.74
contradicts  0.73
they         0.73
accompanies  0.72
occurred     0.69
happens      0.69
THEY         0.67
occurs       0.67
transpired   0.67
Activations Density: 0.130%