Explanations
proper names related to politics or sports, specifically identifying individuals
mentions of specific individuals, particularly in the context of events or circumstances related to death or investigation
Negative Logits
aneously      -0.75
ishable       -0.74
ished         -0.73
ATURES        -0.73
ners          -0.73
************  -0.72
atted         -0.71
ATURE         -0.71
ICAN          -0.69
abwe          -0.68
Positive Logits
traged   0.93
ortment  0.82
ilon     0.74
byss     0.73
xus      0.73
unden    0.70
metry    0.70
mos      0.68
sembly   0.68
thur     0.68
Activations Density 0.060%