INDEX
Explanations
phrases and sentences related to contrast or contradiction
occurrences of phrases that involve conditional or hypothetical scenarios
Negative Logits
Token        Logit
PLUS         -0.90
anwhile      -0.76
fleet        -0.75
.?           -0.70
srfAttach    -0.68
GOODMAN      -0.66
ļéĨĴ         -0.66
EStream      -0.65
sqor         -0.63
INS          -0.63
Positive Logits
Token          Logit
technically    1.24
admittedly     1.16
laud           1.12
undoubtedly    1.09
ostensibly     1.08
initially      1.05
certainly      1.00
theoretically  0.98
admirable      0.96
undeniably     0.94
Activations Density: 0.378%