Explanations
proper nouns and names
names and references related to specific individuals and their works
Negative Logits
Token     Logit
hammad    -0.90
antine    -0.78
usra      -0.78
ted       -0.75
lux       -0.72
icts      -0.70
¥µ        -0.68
dinand    -0.67
rained    -0.67
angelo    -0.67
Positive Logits
Token     Logit
pring     0.90
Kers      0.82
Hitch     0.78
Seeds     0.75
ould      0.74
creen     0.73
alez      0.70
aceous    0.70
olitics   0.69
apers     0.69
Activations Density: 0.026%