INDEX
Explanations
mentions of diseases and their impact on ecosystems or populations
Negative Logits
Token       Logit
suffers     -0.18
suffer      -0.18
experience  -0.16
bá»ĭ        -0.15
éģŃ         -0.15
uffers      -0.15
599         -0.15
éĨ          -0.15
uffer       -0.14
masturb     -0.14
Positive Logits
Token       Logit
targeting   0.19
terror      0.17
cause       0.17
damage      0.17
destroys    0.16
causing     0.16
kill        0.16
destroy     0.15
bankrupt    0.15
nearly      0.15
Activations Density: 0.309%