INDEX
Explanations
instances of the word "And" used to connect thoughts or ideas
Negative Logits
anes        -0.17
ught        -0.15
ural        -0.14
erties      -0.14
ighted      -0.14
strtolower  -0.13
mlink       -0.13
Spar        -0.13
ova         -0.13
etric       -0.13
Positive Logits
zwar    0.15
isman   0.15
isd     0.15
reater  0.15
LLU     0.14
εί      0.14
olean   0.14
ongo    0.14
jos     0.14
_atom   0.14
Activations Density 0.061%