INDEX
Explanations
conjunctions indicating contrast or exception
discourse markers that introduce contrasting information
Negative Logits
segment     -0.77
Cathedral   -0.60
oriented    -0.59
dedicated   -0.57
mound       -0.57
perceived   -0.57
trad        -0.56
segments    -0.56
commentary  -0.56
hall        -0.55
Positive Logits
tons   1.80
tery   1.18
ts     1.15
chers  1.05
tern   1.02
chery  1.02
alas   1.01
ylene  0.98
tle    0.98
cher   0.94
Activations Density 0.103%