INDEX
Explanations
words related to negative actions or criticism
terms related to mismanagement or mistakes
Negative Logits
Token        Logit
ILA          -0.68
INGS         -0.65
eteria       -0.64
Destroyer    -0.62
unto         -0.61
¯¯¯¯         -0.61
iveness      -0.61
ieri         -0.60
sans         -0.60
Ready        -0.59
Positive Logits
Token        Logit
cellaneous   1.43
appropri     1.30
beh          1.15
behavior     1.11
informed     1.11
character    1.10
aligned      1.06
pelled       1.04
jud          1.03
managed      1.02
Activations Density: 0.019%