INDEX
NEGATIVE LOGITS
deck          -0.55
forming       -0.54
unda          -0.54
semble        -0.53
warts         -0.52
ularity       -0.51
ities         -0.51
ular          -0.50
signatures    -0.50
ulation       -0.50
POSITIVE LOGITS
ĵ             0.67
asis          0.62
§             0.62
OSP           0.57
Ĭ             0.57
ouk           0.57
ħ             0.54
uh            0.54
Ezek          0.51
rade          0.51
ACTIVATIONS DENSITY: 0.220%