INDEX
Explanations
pronouns followed by a verb
instances of pronouns and their references in context
Negative Logits
igate         -0.69
screen        -0.68
priv          -0.67
tnc           -0.66
hart          -0.66
orse          -0.66
TT            -0.66
代            -0.65
Forty         -0.65
Opening       -0.65
Positive Logits
nonetheless   1.45
nevertheless  1.42
persisted     0.99
still         0.95
alas          0.91
didnt         0.90
'll           0.89
fortunately   0.88
certainly     0.88
agre          0.87
Activations Density: 0.306%