Explanations
topics related to politics and social issues
topics related to political and social issues
Negative Logits
thouse      -0.90
oso         -0.72
habi        -0.70
arnaev      -0.70
Airbus      -0.68
Airl        -0.67
vik         -0.63
Pont        -0.63
sonian      -0.63
udicrous    -0.62
Positive Logits
topics          1.15
issues          1.11
matters         1.08
etiquette       1.02
affordability   0.96
versus          0.92
topic           0.91
ethics          0.91
needing         0.90
issues          0.89
Activations Density 0.689%