INDEX
Explanations
mentions of specific individuals or proper nouns
proper nouns and names related to individuals or entities
Negative Logits
Vaugh     -0.79
toile     -0.65
CBI       -0.65
redo      -0.63
Antar     -0.59
Notting   -0.59
mot       -0.59
fate      -0.58
ettings   -0.58
canyon    -0.57
Positive Logits
intosh    0.94
htar      0.81
ROR       0.79
ippi      0.79
arella    0.78
nih       0.78
ouf       0.77
agara     0.72
henko     0.70
angelo    0.70
Activations Density: 0.808%