Explanations
requests or solicitations
Negative Logits
zinski      -0.72
Ĥ¬          -0.71
cutting     -0.67
luaj        -0.67
ccording    -0.66
rongh       -0.66
Scouting    -0.63
âĶĢâĶĢ      -0.60
Nanto       -0.60
zar         -0.60
Positive Logits
questions     1.17
rhet          1.08
forgiveness   1.05
naires        1.04
probing       1.00
permission    0.93
plaint        0.89
ingly         0.87
politely      0.86
respondents   0.85
Activations Density 0.047%
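
As a rough illustration of how a density figure like the one above might be read, the sketch below assumes that activation density means the fraction of sampled tokens on which the feature fires above zero. That definition, the function name `activation_density`, and the `threshold` parameter are assumptions for illustration, not something the page states. The token counts are synthetic and chosen only to land on the same percentage.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Fraction of tokens whose activation exceeds `threshold`.

    Assumed definition: a token counts as active when its activation is
    strictly greater than the threshold (0.0 by default).
    """
    if len(activations) == 0:
        raise ValueError("activations must not be empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Synthetic example: 47 active tokens out of 100,000 sampled tokens.
toy_activations = [1.0] * 47 + [0.0] * (100_000 - 47)
density = activation_density(toy_activations)
print(f"{density:.3%}")
```

Under this reading, a density of 0.047% would mean the feature is active on roughly 1 in every 2,100 tokens, which is consistent with a sparse feature tied to a narrow concept such as requests or solicitations.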