INDEX
Negative Logits
Token          Logit
bracht         -0.08
inefficient    -0.08
Gay            -0.07
descriptive    -0.07
cool           -0.07
spr            -0.07
submiss        -0.07
guerra         -0.07
alsa           -0.07
ond            -0.07
Positive Logits
Token          Logit
过滤           0.11
.Filter        0.11
Filtering      0.10
筛             0.10
(filter        0.09
.filtered      0.09
_Filter        0.09
_cleanup       0.09
(filtered      0.09
Filter         0.09
Activations Density 0.025%