INDEX
NEGATIVE LOGITS
philos       -0.75
sourcing     -0.69
parach       -0.68
distilled    -0.67
snowball     -0.66
pudding      -0.65
stocking     -0.65
recycling    -0.64
masc         -0.64
stim         -0.64
POSITIVE LOGITS
9            1.37
5            1.37
95           1.33
97           1.32
6            1.32
7            1.31
8            1.31
98           1.27
93           1.27
96           1.25
Activations Density: 0.067%