INDEX
Explanations
words related to news articles and political topics
Negative Logits
ategory      -0.87
illary       -0.80
rush         -0.80
omore        -0.77
hyde         -0.74
iferation    -0.70
hip          -0.70
iolet        -0.70
emonic       -0.68
amera        -0.66
Positive Logits
spring       1.16
enough       1.01
behaved      1.00
suited       0.95
enough       0.93
esley        0.91
vers         0.85
deserved     0.84
baum         0.80
acquainted   0.78
Activations Density: 0.367%