INDEX
Explanations
statements involving contrast or exceptions
instances of the pronoun "it" and related queries regarding its referents or characteristics
Negative Logits
Token            Logit
hitting          -0.73
uilding          -0.71
TT               -0.71
ands             -0.68
Travels          -0.68
soDeliveryDate   -0.68
Tam              -0.66
umbn             -0.64
acca             -0.64
TAG              -0.63
Positive Logits
Token            Logit
admittedly       0.93
nonetheless      0.86
occasional       0.85
otherwise        0.83
disagree         0.83
retains          0.81
nevertheless     0.81
differed         0.81
slight           0.81
persisted        0.81
Activations Density 0.225%
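
As a rough illustration of what a density figure like the one above can mean, the sketch below computes the fraction of token positions on which a feature is active. The definition used here (activation strictly above a threshold, divided by total positions), the function name activation_density, and the sample data are all assumptions made for illustration; none of them are taken from this dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumed definition for illustration only: the dashboard may compute
    its density figure differently.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 9 active positions out of 4000 tokens.
sample = [0.0] * 3991 + [1.2] * 9
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density well below 1% indicates a sparse feature that fires on only a small share of tokens, which fits a feature tied to specific contrast or exception wording.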