INDEX
Explanations
instances of the word "or" in various contexts
Negative Logits
ept      -0.17
Rica     -0.16
gue      -0.16
ibern    -0.15
kou      -0.15
onor     -0.15
keley    -0.15
ey       -0.15
beth     -0.15
zing     -0.14
Positive Logits
acular   0.18
iesz     0.18
ifice    0.17
acles    0.17
iol      0.15
icht     0.15
acy      0.15
else     0.15
wel      0.15
063      0.14
Activations Density 0.037%