INDEX
Explanations
dialogue punctuation, specifically question marks and quotation marks
punctuation marks, particularly periods and commas
Negative Logits
Token      Logit
confir     -0.91
jri        -0.88
ividual    -0.86
proport    -0.82
reconc     -0.81
undermin   -0.81
regate     -0.80
commer     -0.80
tremend    -0.78
volunte    -0.78
Positive Logits
Token      Logit
Slowly     1.45
Suddenly   1.39
Turning    1.35
Damn       1.28
Something  1.26
Somehow    1.26
Tears      1.25
Looking    1.24
Then       1.23
Seeing     1.23
Activations Density 0.117%
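
The page does not say how the density figure is computed. A common reading, assumed here, is the fraction of sampled token positions on which the feature's activation is above zero. The sketch below illustrates that reading on synthetic data; the function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions on which the
    feature fires at all. The dashboard may define it differently.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Synthetic example (not real data): one active position out of 1000.
sample = [0.0] * 999 + [2.3]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.117% would mean the feature is active on roughly 1 in 850 sampled tokens.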