INDEX
Explanations
countries and people's names
names of countries, people, or organizations involved in political contexts
Negative Logits
Token         Logit
adobe         -0.79
buquerque     -0.64
Aub           -0.59
Ambro         -0.59
Becky         -0.57
Ung           -0.57
Osc           -0.55
Joy           -0.55
Earn          -0.55
/             -0.54
Positive Logits
Token         Logit
alike         1.77
respectively  1.13
together      0.90
combined      0.84
jointly       0.81
versa         0.80
together      0.76
separately    0.75
insepar       0.75
combine       0.74
Activations Density: 0.390%