INDEX
Explanations
interrogative phrases and questions directed at individuals about their experiences and preferences
Negative Logits
elter     -0.16
uzu       -0.15
annes     -0.15
amm       -0.15
yle       -0.15
ůst       -0.14
Farmer    -0.14
razier    -0.14
wonders   -0.14
usk       -0.14
Positive Logits
advice    0.24
Advice    0.20
Advice    0.17
message   0.15
advise    0.15
sunk      0.15
favorite  0.15
使        0.15
favorite  0.15
vala      0.14
Activations Density: 0.044%