Explanations
emotional expressions and sentiments towards experiences or evaluations
Negative Logits
Token              Logit
(token not shown)  -1.68
,                  -1.58
(                  -1.55
.                  -1.48
-                  -1.46
1                  -1.42
2                  -1.40
↵↵                 -1.39
↵                  -1.38
in                 -1.37
Positive Logits
Token            Logit
desmotivaciones  1.20
étoient          1.14
avoient          1.13
étoit            1.06
feroit           1.05
auroit           0.96
pleaſure         0.96
pouvoit          0.94
pérd             0.92
miniaturka       0.91
Activations Density: 17.417%
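
The page does not say how the density figure is computed. A minimal sketch, assuming it is the fraction of sampled tokens on which the feature's activation is above zero: the function name `activation_density`, the `threshold` parameter, and the sample values are all hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of activations strictly above `threshold`.

    Assumption: "density" means the share of tokens on which the
    feature fires at all; the source page does not confirm this.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Made-up per-token activations for illustration only.
sample = [0.0, 0.0, 1.3, 0.0, 0.4, 0.0, 0.0, 0.0]
print(f"{activation_density(sample):.3%}")  # prints 25.000%
```

Under that assumption, a density of 17.417% would mean the feature fires on roughly one in six sampled tokens.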