Explanations
terms related to fellowships and academic affiliations
Negative Logits
Token     Logit
gil       -0.19
bone      -0.18
oola      -0.17
baz       -0.17
okers     -0.16
iginal    -0.15
ophil     -0.15
yclic     -0.15
ÏĦα       -0.15
Voll      -0.14
Positive Logits
Token     Logit
ships     0.39
ship      0.27
shipping  0.25
hips      0.22
SHIP      0.22
hip       0.22
chaft     0.20
ows       0.20
infeld    0.16
iesen     0.16
Activations Density: 0.006%