INDEX
Negative Logits
pta        -0.76
soever     -0.75
andel      -0.72
zers       -0.71
perse      -0.69
strain     -0.64
undermin   -0.63
toget      -0.63
oxide      -0.62
cumbers    -0.62
Positive Logits
rait       1.41
folios     1.34
folio      1.27
raits      1.27
ugal       1.10
ional      1.02
eur        0.91
nell       0.87
ioned      0.86
ions       0.84
Activations Density: 0.009%