INDEX
Explanations
questions or statements ended in question marks
dialogue and questions expressed by characters
Negative Logits
Token                                             Logit
------------------------------------------------  -0.75
endorsements                                      -0.66
accompl                                           -0.63
successes                                         -0.60
victories                                         -0.59
accord                                            -0.57
acknowled                                         -0.57
fixes                                             -0.57
surviving                                         -0.57
toc                                               -0.56
Positive Logits
Token       Logit
asked       1.88
asks        1.84
inquired    1.74
wondered    1.53
ask         1.48
questions   1.47
quer        1.43
question    1.38
questioned  1.38
demanded    1.32
Activations Density 0.136%