INDEX
Explanations
phrases related to the dissemination and circulation of information or rumors
Negative Logits
olute         -0.67
bowed         -0.63
equals        -0.60
ona           -0.59
itar          -0.58
pan           -0.57
deport        -0.56
sue           -0.56
Shell         -0.56
ependence     -0.56
Positive Logits
rumor             0.98
circulated        0.98
Rum               0.94
rumors            0.88
circulate         0.87
misinformation    0.83
circulating       0.82
online            0.82
(blank)           0.80
wildfire          0.78
Activations Density 0.079%
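
The dashboard does not state how the density figure is computed. One common reading, taken here as an assumption, is the fraction of sampled token positions on which the feature's activation is above zero. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the toy data are hypothetical and do not come from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions on which the
    feature fires at all. The dashboard itself does not confirm this.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 10,000 token positions, 8 of which fire.
toy = [0.0] * 9992 + [1.3, 0.4, 2.1, 0.9, 0.2, 3.0, 0.7, 1.1]
density = activation_density(toy)
print(f"{density:.3%}")  # prints 0.080%
```

Under this reading, a density of 0.079% would mean the feature fires on roughly 8 of every 10,000 sampled tokens, which fits a narrow feature tied to rumor and circulation vocabulary.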