INDEX
Explanations
phrases introducing a contrast or an exception
the word "Although" in various contexts
Negative Logits
enter     -0.71
taboola   -0.68
tnc       -0.67
ais       -0.67
Eye       -0.65
ãĥĬ       -0.64
ISE       -0.63
elle      -0.62
chase     -0.61
isible    -0.61
Positive Logits
acknowledging  0.84
conced         0.80
yip            0.77
olulu          0.72
agreeing       0.72
browsing       0.71
soever         0.69
hattan         0.68
compiling      0.68
REDACTED       0.65
Activations Density 0.021%