INDEX
Explanations
Terms related to false responses or incorrect answers in various contexts
Non-English language fragments
Romance and Slavic language word endings
NEGATIVE LOGITS
purpoſe     -1.17
itſelf      -1.15
himſelf     -1.06
iſt         -1.06
myſelf      -1.05
Monfieur    -1.03
houſe       -1.03
Efq         -1.01
raiſ        -1.01
ſtate       -1.00
POSITIVE LOGITS
ones         0.65
которые      0.54
             0.51
fantasies    0.49
b            0.49
Kanpo        0.45
d            0.44
vode         0.43
von          0.43
k            0.43
Activations Density 0.019%