Explanations
negative assertions or phrases that convey a lack of something
Negative Logits
Token      Logit
Majefty    -0.99
myſelf     -0.83
Cæsar      -0.83
quæ        -0.81
ſeveral    -0.77
respeito   -0.75
Monfieur   -0.72
raiſ       -0.72
ejus       -0.72
itſelf     -0.71
Positive Logits
Token      Logit
no         1.20
No         0.87
нет        0.83
Нет        0.79
keine      0.78
igno       0.74
no         0.73
ไม่มี       0.70
NO         0.70
geen       0.69
Activations Density: 0.127%
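
The page does not say how the density figure is computed. A minimal sketch, assuming it is the fraction of sampled token positions on which the feature's activation is above zero; the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and used only for illustration:

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature fires.
    This is an illustrative definition, not one confirmed by the page.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 1000 token positions, the feature fires on 2 of them.
acts = [0.0] * 998 + [1.5, 0.3]
density = activation_density(acts)
print(f"{density:.3%}")  # prints 0.200%
```

Under this reading, a density of 0.127% would mean the feature is active on roughly 1 in every 790 tokens of the sampled text.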