INDEX
Explanations
instances of the word "complaint" within text
mentions of formal complaints
Negative Logits
Token       Logit
aughs       -0.80
itals       -0.78
raham       -0.77
mers        -0.75
eton        -0.71
artifacts   -0.70
lasses      -0.69
ocrates     -0.68
bern        -0.67
ooks        -0.67
Positive Logits
Token       Logit
complaints  1.06
complaint   1.00
alleging    0.96
filed       0.85
lodged      0.81
alleges     0.78
against     0.77
isance      0.77
leveled     0.76
naires      0.75
Activations Density: 0.020%