INDEX
Explanations
words related to political or social tension
references to escalating conflicts or disagreements
Negative Logits
cise     -0.79
icle     -0.75
ardi     -0.70
icles    -0.69
prints   -0.69
option   -0.69
ãģ®å     -0.65
sole     -0.65
otto     -0.65
glas     -0.63
Positive Logits
flared       0.91
cooker       0.87
tension      0.87
tensions     0.86
rained       0.80
flare        0.79
simmer       0.78
escal        0.78
surrounding  0.78
eb           0.77
Activations Density: 0.043%