INDEX
NEGATIVE LOGITS
such        0.58
Σ           0.52
a           0.51
f           0.48
society     0.48
Housewives  0.48
something   0.47
ν           0.47
الت         0.46
Fog         0.46
POSITIVE LOGITS
thumb       0.79
pouce       0.71
thumbs      0.68
Thumb       0.66
ใช่         0.55
Thumb       0.54
grü         0.52
thump       0.52
↵           0.51
liking      0.51
Activations Density 0.004%