Explanations
terms related to environmental concerns and sustainability
Negative Logits
o       -0.35
ole     -0.27
l       -0.26
af      -0.25
a       -0.25
oa      -0.25
oom     -0.24
ales    -0.24
ape     -0.24
ri      -0.24
Positive Logits
s       0.22
raft    0.19
̧       0.19
sak     0.18
sse     0.16
sb      0.16
sol     0.16
ourt    0.16
èĽĭ     0.16
sat     0.15
Activations Density 0.146%