INDEX
Explanations
elements related to nature and the environment
Negative Logits
ily     -0.18
shot    -0.16
omb     -0.16
vert    -0.15
enson   -0.15
Pink    -0.15
ern     -0.15
sc      -0.14
UPS     -0.14
ocab    -0.14
Positive Logits
skal    0.26
inse    0.23
fj      0.23
troll   0.21
tv      0.21
må      0.18
dv      0.18
moss    0.18
vess    0.18
muss    0.17
Activations Density 0.008%