INDEX
Explanations
expressions related to informative content and personal experiences
Negative Logits
astify       -0.69
Попис        -0.57
twór         -0.55
Географија   -0.54
锈钢         -0.53
bebe         -0.51
hlon         -0.51
ytale        -0.51
apunov       -0.49
Palabras     -0.49
Positive Logits
informative    1.16
enlightening   0.98
interesting    0.98
useful         0.95
helpful        0.90
Inform         0.90
inform         0.90
insightful     0.89
illuminating   0.88
Inform         0.86
Activations Density 0.366%