INDEX
Explanations
expressions of enthusiasm and excitement about experiences
Negative Logits
:           -0.58
            -0.57
?           -0.56
(‘          -0.55
(“          -0.55
—           -0.54
'';         -0.54
Zig         -0.53
=           -0.52
\}\\        -0.52
Positive Logits
popolari    0.79
suivant     0.71
politiet    0.70
giustizia   0.68
clô         0.67
fidélité    0.67
maioria     0.66
victimes    0.66
importanza  0.66
cauza       0.65
Activations Density 0.236%