Explanations
instances of the word "or."
Negative Logits
msp       -0.17
stime     -0.17
udeau     -0.17
روس       -0.16
tsky      -0.15
elage     -0.15
okus      -0.14
itou      -0.14
inic      -0.14
Segue     -0.14
Positive Logits
else      0.19
simply    0.17
anges     0.16
rather    0.15
ooter     0.15
naments   0.15
imas      0.15
maybe     0.15
indeed    0.14
Or        0.14
Activations Density 0.065%