Explanations
proper nouns related to a specific individual
references to specific individuals' names, particularly those associated with a significant event or controversy
Negative Logits
Token     Logit
lihood    -0.92
es        -0.72
Veg       -0.71
chal      -0.70
eed       -0.67
ees       -0.66
Kend      -0.65
cha       -0.64
Corpus    -0.63
esity     -0.63
Positive Logits
Token     Logit
flush     0.88
adeon     0.88
strip     0.79
ategic    0.79
itamin    0.77
ushing    0.76
utor      0.76
chnology  0.74
ogun      0.72
ouses     0.70
Activations Density: 0.025%