INDEX
Negative Logits
glers      -0.83
hail       -0.68
cones      -0.62
bother     -0.58
fires      -0.58
fork       -0.58
cigars     -0.57
spiders    -0.57
Typhoon    -0.57
snakes     -0.56
POSITIVE LOGITS
illary     1.00
wered      1.00
heim       0.94
imity      0.93
otropic    0.88
obic       0.83
ibo        0.82
orage      0.81
andre      0.79
estic      0.76
Activations Density 0.041%