Explanations
references to plants and plant-based products or materials
Negative Logits
plants    -0.18
plants    -0.17
ceph      -0.17
him       -0.16
eration   -0.15
paque     -0.15
ichten    -0.15
heet      -0.14
noop      -0.14
bitte     -0.14
Positive Logits
ations    0.34
ain       0.30
ains      0.28
ers       0.24
ings      0.23
ar        0.23
AINS      0.23
/tree     0.22
igr       0.21
ation     0.21
Activations Density: 0.023%