Explanations
certainty and confidence expressed through affirmative language
Negative Logits
Token    Logit
ester    -0.17
etine    -0.16
Bender   -0.15
lez      -0.15
èĸ       -0.14
yt       -0.14
agma     -0.14
Ñĩи      -0.14
ÅĻeb     -0.14
apan     -0.14
Positive Logits
Token    Logit
ulas     0.15
Dream    0.15
addon    0.14
252      0.14
885      0.14
ijo      0.14
undo     0.14
ICAST    0.14
uchi     0.14
718      0.13
Activations Density: 0.245%