Negative Logits
taking        -1.20
Taking        -1.13
Taking        -1.08
takes         -0.87
taking        -0.82
take          -0.80
er            -0.76
tomando       -0.71
takers        -0.68
tak           -0.66
Positive Logits
pleaſure      0.72
poffe         0.68
TemporalType  0.66
ffions        0.66
leaſt         0.65
ſmall         0.64
eace          0.63
.*")]         0.63
__).          0.63
poffible      0.63
Activations Density 0.111%