INDEX
Explanations
inquiries and responses within a dialogue context
Negative Logits
wonder     -0.18
asking     -0.17
ask        -0.17
Answers    -0.17
缼         -0.17
activex    -0.15
quired     -0.15
ascal      -0.15
Ask        -0.15
asking     -0.15
Positive Logits
questions  0.36
burning    0.29
Questions  0.27
questions  0.27
Questions  0.24
tough      0.23
question   0.23
Burning    0.22
вопрос     0.20
burn       0.20
Activations Density: 0.069%