Explanations
Prominent, well-known people, such as public figures, thinkers, authors, artists, performers, and personalities.
Negative Logits
Token        Logit
rontal       -0.75
Operation    -0.70
ウス         -0.65
ighter       -0.63
Wonderland   -0.60
dule         -0.60
awar         -0.59
Gun          -0.59
fork         -0.58
Winds        -0.58
Positive Logits
Token        Logit
whom         1.10
hips         1.09
who          1.09
whose        1.01
hip          1.00
paces        0.96
folk         0.90
admired      0.88
who          0.88
laureate     0.88
Activations Density: 0.337%