Explanations
words related to opposition or disagreement
references to opposing viewpoints or positions
Negative Logits
Interstitial   -0.80
çīĪ            -0.66
Learns         -0.62
enegger        -0.62
KER            -0.61
Fork           -0.61
Carnival       -0.61
Fallen         -0.60
beit           -0.59
negie          -0.59
Positive Logits
onent          1.65
osite          1.63
ortun          1.60
osition        1.51
onents         1.48
osing          1.29
ressive        1.23
ression        1.20
osed           1.17
inion          1.16
Activations Density: 0.016%