INDEX
Explanations
references to government or authority figures, particularly those related to oversight and investigations
occurrences of the word "inspector."
Negative Logits
theless    -1.01
birth      -0.79
eda        -0.70
erry       -0.70
talk       -0.69
words      -0.68
asted      -0.65
eton       -0.64
demand     -0.64
esville    -0.63
Positive Logits
Gadget     1.09
atis       0.85
Reviewer   0.85
Inspector  0.83
Spac       0.82
inspector  0.82
itle       0.76
orate      0.76
atus       0.73
ror        0.72
Activations Density 0.011%