Explanations
terms related to public statements or opinions
instances of commentary or remarks made by individuals, particularly in a contentious or newsworthy context
Negative Logits
ISH          -0.75
Recon        -0.70
PG           -0.68
fruit        -0.67
BLIC         -0.66
YP           -0.63
Rescue       -0.62
Brill        -0.62
Cav          -0.60
resin        -0.60
Positive Logits
uttered      1.01
comments     0.90
remarks      0.89
guiActiveUn  0.82
aloud        0.81
ariat        0.77
dispar       0.72
slurs        0.71
storms       0.69
aturday      0.69
Activations Density: 0.061%