Explanations
questions or interrogations
instances of inquiries or questions posed
Negative Logits
Token          Logit
EStreamFrame   -0.71
marine         -0.70
execute        -0.68
Fit            -0.67
psc            -0.67
endi           -0.66
equal          -0.65
FC             -0.63
rats           -0.63
Right          -0.62
Positive Logits
Token          Logit
rhet           1.08
quizz          0.99
questions      0.90
ioned          0.90
Questions      0.90
asked          0.84
probing        0.83
sarcast        0.82
Asked          0.78
quer           0.76
Activations Density: 0.015%