Explanations
proper nouns, specifically human names or titles
phrases or expressions that include the word "of" related to names
Negative Logits
Token       Logit
iasm        -0.74
touch       -0.71
agy         -0.69
issions     -0.67
ists        -0.66
Divide      -0.65
idelines    -0.65
estyles     -0.65
ptions      -0.62
venants     -0.60
Positive Logits
Token       Logit
Uzbek       0.80
culprit     0.74
シャ        0.68
ガ          0.65
Shots       0.64
ゴ          0.63
suspects    0.62
Ox          0.62
otta        0.62
Killer      0.61
Activations Density: 0.128%
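
Non-ASCII tokens in byte-level BPE vocabularies are often displayed as strings such as `ãĤ·ãĥ£` rather than as readable text. The sketch below shows one way to decode such strings, assuming the tokenizer uses the GPT-2 style byte-to-character table (an assumption; the page does not name the tokenizer). The helper names `bytes_to_unicode` and `decode_token` are illustrative.

```python
def bytes_to_unicode():
    """Build the GPT-2 style table mapping each byte value to a printable character."""
    # Bytes that are already printable map to themselves.
    bs = list(range(33, 127)) + list(range(161, 173)) + list(range(174, 256))
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point at 256 and above.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, map(chr, cs)))


# Inverse table: displayed character -> original byte value.
CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Turn a byte-level token string back into UTF-8 text.

    Raises KeyError if the string contains a character outside the table,
    which would suggest the token was not byte-level encoded this way.
    """
    raw = bytes(CHAR_TO_BYTE[ch] for ch in token)
    # A single token may hold only part of a multi-byte character,
    # so undecodable bytes are replaced rather than raising.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ãĤ·ãĥ£", "ãĤ¬", "ãĤ´", "Uzbek"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Under that assumption, the three non-ASCII entries in the positive logits list decode to the katakana シャ, ガ, and ゴ, while plain ASCII tokens such as `Uzbek` pass through unchanged.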