INDEX
Explanations
verbs or nouns related to causing harm or damage
words related to destruction or damage
Negative Logits
DragonMagazine  -0.87
uchs            -0.74
zbek            -0.70
gomery          -0.69
ussen           -0.68
uana            -0.67
acus            -0.66
liner           -0.66
KER             -0.65
glas            -0.64
Positive Logits
havoc           1.34
roying          0.83
adoes           0.82
wrought         0.82
arte            0.73
habitats        0.73
wre             0.73
nests           0.71
piles           0.71
ados            0.71
Activations Density: 0.092%