INDEX
Explanations
references to climate change and its impacts on societal issues
NEGATIVE LOGITS
Token        Logit
bid          -0.15
contrad      -0.14
ूचन          -0.14
irim         -0.13
udder        -0.13
τή           -0.13
nier         -0.13
ozem         -0.13
asant        -0.13
Forbidden    -0.13
POSITIVE LOGITS
Token        Logit
unless       0.23
Unless       0.22
too          0.21
Unless       0.20
too          0.19
Too          0.19
steps        0.19
unless       0.19
Too          0.18
action       0.18
Activations Density 0.411%
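
The density figure above can be read as the share of sampled tokens on which the feature fires at all. The sketch below assumes that definition (activation strictly greater than zero); the function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" means the share of tokens with a nonzero
    activation. The dashboard itself does not state its definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 1000 tokens, 4 of which activate the feature.
sample = [0.0] * 996 + [0.7, 1.2, 0.3, 2.1]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.411% would mean the feature is active on roughly 4 of every 1000 tokens.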