INDEX
Explanations
the word "truly" and associated sentiments of authenticity or sincerity
Negative Logits
5              -0.71
silencio       -0.68
ta             -0.67
incompetence   -0.66
UserScript     -0.66
Walling        -0.66
incompati      -0.65
abusi          -0.65
hibernation    -0.64
Hessian        -0.63
POSITIVE LOGITS
truly            2.03
truly            2.01
Truly            1.95
Truly            1.89
verdaderamente   1.15
genuinely        1.06
tru              0.98
cerely           0.89
benar            0.86
true             0.82
Activations Density 0.042%
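
For readers unfamiliar with the density figure, here is a minimal sketch of one common way such a number could be computed: the fraction of tokens on which the feature's activation is above zero. The function name `activation_density`, the zero threshold, and the sample data are all assumptions made for illustration; the page does not state how its 0.042% was obtained.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens on which the feature
    fires at all; the dashboard itself does not define the term.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical data: 10,000 tokens, with the feature firing on 4 of them.
sample = [0.0] * 9996 + [1.2, 0.7, 3.1, 0.4]
print(f"{activation_density(sample):.3%}")  # prints 0.040%
```

A density this low would indicate a sparse feature, which fits the narrow set of tokens in the positive logits above.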