INDEX
Explanations
references to plants and plant-based products
Negative Logits
Token      Logit
plants     -0.19
plants     -0.18
ceph       -0.17
him        -0.15
ichten     -0.15
paque      -0.15
eration    -0.15
noop       -0.15
sheets     -0.14
ponent     -0.14
Positive Logits
Token      Logit
ations     0.33
ain        0.27
ains       0.27
ers        0.24
ings       0.24
ar         0.23
AINS       0.23
/tree      0.20
AIN        0.20
kingdom    0.19
Activations Density 0.024%