Explanations
phrases or sentences that introduce a contrast or counterpoint
the word "that" used in various contexts
Negative Logits
姫          -0.75
sbm         -0.74
byss        -0.72
pots        -0.72
pill        -0.72
pecially    -0.72
hops        -0.71
dstg        -0.71
olics       -0.71
adelphia    -0.71
Positive Logits
doesn        1.00
ignores      0.93
hasn         0.92
doesnt       0.90
nonetheless  0.89
depends      0.89
differs      0.86
isn          0.85
didn         0.84
wasn         0.84
Activations Density: 0.106%