Explanations
the word "both" in various contexts
Negative Logits
hill    -0.15
ted     -0.14
ç̬      -0.14
sey     -0.14
ibase   -0.14
tid     -0.14
iert    -0.13
§       -0.13
amos    -0.13
asto    -0.13
Positive Logits
exas        0.16
ilde        0.15
ë§ī         0.15
593         0.15
ouns        0.14
irs         0.14
лÑıн        0.14
esiz        0.13
McConnell   0.13
rav         0.13
Activations Density: 0.028%