INDEX
Explanations
words related to accusations or claims
references to allegations
Negative Logits
spl             -0.75
oppy            -0.69
pper            -0.65
tz              -0.65
owa             -0.65
ggies           -0.64
onen            -0.64
busiest         -0.62
Interstitial    -0.62
Cav             -0.62
Positive Logits
allegations     1.27
accusations     1.05
allegation      1.00
alleging        0.93
accusation      0.91
alleges         0.87
accusing        0.84
accus           0.83
accuser         0.82
riott           0.80
Activations Density 0.018%