INDEX
NEGATIVE LOGITS
hot            -1.02
cup            -0.92
Hot            -0.81
Cup            -0.76
hot            -0.73
Hot            -0.73
Cups           -0.59
ist            -0.58
脚注の使い方   -0.57
cups           -0.56
POSITIVE LOGITS
pleaſure       0.65
myſelf         0.64
ſta            0.63
toid           0.63
ſtate          0.62
ſhould         0.60
enic           0.60
Chriſt         0.59
élé            0.59
theſe          0.59
Activations Density 0.149%