Explanations
phrases related to causing harm or negative consequences
Negative Logits
Tatsache: -0.78
קישורים: -0.76
paddingVertical: -0.72
للمعارف: -0.67
Adri: -0.65
strå: -0.64
Haller: -0.63
ppins: -0.63
mellitus: -0.63
Realität: -0.62
Positive Logits
cause: 1.39
CAUSE: 1.37
Caus: 1.36
Causes: 1.35
Cause: 1.35
causes: 1.32
causes: 1.30
caused: 1.28
caused: 1.26
cause: 1.25
Activations Density: 0.093%