INDEX
Explanations
references to historical figures and their relationships
Negative Logits
idar        -0.15
ueur        -0.15
Andres      -0.15
Hab         -0.14
apur        -0.14
ammer       -0.14
parsley     -0.14
Ezra        -0.13
Prefs       -0.13
anonymous   -0.13
Positive Logits
count       0.20
Counts      0.19
princ       0.18
counts      0.18
Bourbon     0.18
Counts      0.17
Count       0.17
/count      0.17
Prince      0.17
semb        0.15
Activations Density: 0.056%