Explanations
words related to aggressive action or conflict
instances of the word "thrash" and its variations in the context of negative actions or outcomes
Negative Logits
Token               Logit
REDACTED            -0.84
DEN                 -0.80
enegger             -0.79
PowerPoint          -0.67
KEN                 -0.67
++++++++++++++++    -0.66
renheit             -0.66
dylib               -0.65
Marino              -0.65
WER                 -0.64
Positive Logits
Token               Logit
ifty                1.09
anches              1.08
inges               1.04
ashing              1.04
assi                1.01
uster               1.00
ongs                0.96
ift                 0.94
ights               0.94
ace                 0.94
Activations Density 0.015%