INDEX
Explanations
affirmative statements or expressions of certainty
use of certainly
Negative Logits
Token        Logit
head         -0.40
bag          -0.39
wing         -0.39
drop         -0.37
ID           -0.35
loss         -0.35
épis         -0.35
dead         -0.35
lif          -0.34
DOD          -0.34
Positive Logits
Token        Logit
certainly    0.95
certainly    0.92
Certainly    0.82
Certainly    0.82
ptonshire    0.77
ciertamente  0.74
jména        0.74
certamente   0.73
sicherlich   0.69
deſſen       0.69
Activations Density: 0.006%