INDEX
Explanations
geographic locations and proper nouns associated with specific places
Negative Logits
+:+        -0.16
Stall      -0.15
jes        -0.14
Aviv       -0.14
Richards   -0.14
bru        -0.14
367        -0.14
Cousins    -0.14
gger       -0.13
alf        -0.13
Positive Logits
Stateless  0.17
aukee      0.17
oodle      0.17
Indian     0.16
Indians    0.16
Backbone   0.15
dfd        0.15
ipi        0.15
iná        0.15
oga        0.15
Activations Density 0.067%