Explanations
mentions of specific names or individuals, possibly related to news articles or stories
Negative Logits
Els         -0.70
Dominion    -0.68
Load        -0.67
Agric       -0.65
ories       -0.64
Pixie       -0.64
oard        -0.63
Native      -0.63
Georg       -0.62
Krish       -0.60
Positive Logits
Jr          1.19
Sr          0.95
agher       0.88
III         0.87
himself     0.86
famously    0.86
confid      0.85
joked       0.84
Jr          0.83
ovich       0.83
Activations Density: 1.068%