INDEX
Explanations
proper nouns related to political figures or entities
lists of endorsed individuals or items within a specific context
Negative Logits
olding        -0.78
uchi          -0.70
arden         -0.68
utters        -0.65
protection    -0.64
sense         -0.64
confines      -0.64
old           -0.62
Riyadh        -0.62
olds          -0.61
Positive Logits
notable         0.97
noteworthy      0.96
starred         0.92
memorable       0.91
featured        0.87
ranked          0.85
listed          0.85
nominated       0.84
participated    0.80
endors          0.79
Activations Density 0.548%