INDEX
Explanations
emotional experiences and relationships
Negative Logits
VERRIDE       -0.20
çĶ            -0.17
.respond      -0.17
xde           -0.15
Replies       -0.15
osoph         -0.15
ä¸ĭ载次æķ°    -0.14
elles         -0.14
Kum           -0.14
Äįka          -0.14
Positive Logits
statement     0.17
correct       0.17
statements    0.15
elabor        0.14
logic         0.14
correct       0.14
Colony        0.14
szcz          0.14
tone          0.14
point         0.13
Activations Density: 0.263%