INDEX
Explanations
reports of violent or brutal acts
Negative Logits
Bowman     -0.18
razier     -0.16
trace      -0.15
éĺħ读      -0.14
Trace      -0.14
çĿ£        -0.14
defs       -0.14
æİĽ        -0.14
omentum    -0.14
Scan       -0.13
Positive Logits
reporting  0.45
report     0.43
reports    0.42
Reporting  0.39
reports    0.38
-report    0.36
Reporting  0.36
Reports    0.36
report     0.35
REPORT     0.34
Activations Density: 0.236%