INDEX
Explanations
proper nouns or names mentioned in a text
phrases that include the word "name" followed by a specific reference to individuals
Negative Logits
alysis      -0.79
ashtra      -0.79
entimes     -0.77
DEV         -0.75
igue        -0.74
ende        -0.72
resil       -0.71
vable       -0.70
âĶľ         -0.69
cape        -0.69
Positive Logits
Hussein      0.74
ãĤ±          0.69
aiden        0.69
Haku         0.69
ammad        0.69
nationality  0.69
thood        0.67
Trafford     0.65
Hamm         0.65
suspects     0.64
Activations Density 0.168%
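
Two entries in the logit lists, `ãĤ±` and `âĶľ`, look like mojibake but are probably tokens shown in a byte-level BPE display alphabet. The sketch below assumes the vocabulary follows the GPT-2 byte-to-unicode convention (the source does not name the model or tokenizer, so this is an assumption), and the helper name `decode_display_token` is hypothetical. It maps each displayed character back to its raw byte and decodes the result as UTF-8.

```python
def bytes_to_unicode():
    """Build the GPT-2 style byte -> display-character table.

    Printable bytes map to themselves; every other byte is shifted
    to a code point at or above 256 so that it has a visible glyph.
    """
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    cs = bs[:]
    n = 0
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return {b: chr(c) for b, c in zip(bs, cs)}


# Inverse table: display character -> raw byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_display_token(token):
    """Hypothetical helper: convert a displayed token back to readable text.

    Assumes every character belongs to the byte-level display alphabet;
    a character outside that alphabet raises KeyError.
    """
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


print(decode_display_token("ãĤ±"))      # ケ (katakana KE, U+30B1)
print(decode_display_token("âĶľ"))      # ├ (box-drawing character, U+251C)
print(decode_display_token("Hussein"))  # Hussein (ASCII is unchanged)
```

Under that assumption, the positive-logit entry `ãĤ±` is the katakana character ケ and the negative-logit entry `âĶľ` is the box-drawing character ├. The tokens are left in their displayed form in the tables above because that is how the page reports them.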