Explanations
instances of the word "that."
Negative Logits
Token             Logit
pery              -0.18
ãĤ¤ãĥ¤            -0.18
BuilderFactory    -0.17
navr              -0.16
deaux             -0.16
iaux              -0.16
Means             -0.16
rud               -0.15
theid             -0.15
eid               -0.14
Positive Logits
Token             Logit
way               0.56
-way              0.37
way               0.35
Way               0.34
_way              0.32
.way              0.30
WAY               0.29
away              0.29
Way               0.28
direction         0.28
Activations Density: 0.018%