INDEX
Explanations
information related to curfews and enforcement actions in specific districts
Negative Logits
Token        Logit
Rumors       -0.65
uș           -0.65
Honorable    -0.65
behaviors    -0.64
își          -0.62
señores      -0.61
rumors       -0.60
Honorable    -0.59
Fucking      -0.57
FUCKING      -0.56
Positive Logits
Token              Logit
DebuggerNonUser    0.80
,’’                0.74
artistes           0.73
IANS               0.71
',"                0.70
),"                0.69
cerpts             0.67
OGND               0.67
expandindo         0.67
���                0.66
Activations Density 0.135%