Explanations
quotes from official statements
statements or quotes from official sources or organizations
Negative Logits
distur          -0.78
underestimated  -0.74
cffffcc         -0.69
haun            -0.68
────            -0.68
stim            -0.68
impro           -0.68
snipers         -0.67
halluc          -0.67
reality         -0.66
Positive Logits
statement   1.07
Statement   0.95
Statement   0.89
emailed     0.82
stated      0.80
wrote       0.80
disclaimer  0.80
announcing  0.78
wrote       0.78
Pledge      0.77
Activations Density 0.208%