INDEX
Negative Logits
Sen       -0.52
sh        -0.50
bal       -0.49
sen       -0.49
Bal       -0.48
es        -0.47
Arma      -0.46
Bal       -0.46
sa        -0.46
fjspx     -0.45
Positive Logits
ſelf      1.00
myſelf    0.91
ſelves    0.87
ing       0.86
uſed      0.86
itſelf    0.84
ſta       0.82
uſe       0.81
preſent   0.78
pleaſure  0.78
Activations Density 0.093%