Explanations
phrases related to negative issues or problems
references to negative conditions or issues affecting society
Negative Logits
ials      -0.80
dp        -0.78
ettel     -0.77
ibling    -0.77
earchers  -0.76
initions  -0.76
Streamer  -0.75
ynthesis  -0.74
edar      -0.74
ndra      -0.71
Positive Logits
plague    1.42
plag      0.93
Plague    0.93
scourge   0.87
blight    0.84
infect    0.79
bugs      0.78
bug       0.77
swarm     0.76
Nost      0.75
Activations Density: 0.011%