INDEX
Explanations
mentions of boycotts
references to boycott-related terms
Negative Logits
reluct        -0.87
confir        -0.87
exha          -0.83
srf           -0.79
actu          -0.79
nesota        -0.78
planner       -0.77
concurrent    -0.76
senal         -0.76
¥ŀ            -0.75
Positive Logits
cott          1.64
ijn           0.93
ages          0.87
wright        0.87
ione          0.86
riott         0.84
ieu           0.83
aign          0.83
olkien        0.82
ards          0.82
Activations Density 0.008%