INDEX
Explanations
words or phrases related to criticism and political statements
sentences that include declarative statements, particularly those indicating serious issues or opinions
Negative Logits
Token       Logit
volunte     -0.85
tremend     -0.84
gobl        -0.80
confir      -0.73
corrid      -0.71
millenn     -0.71
purse       -0.70
unnecess    -0.69
defic       -0.69
challeng    -0.69
Positive Logits
Token       Logit
His         2.12
He          2.11
His         1.79
He          1.66
Himself     1.47
his         1.38
his         1.26
he          1.25
HIS         1.24
Asked       1.15
Activations Density: 0.607%