Explanations
phrases related to authority and orders
references to immigration and legal status issues
Negative Logits
Alas        -0.75
etheless    -0.75
Adds        -0.67
"{          -0.64
roxy        -0.63
Often       -0.60
Verge       -0.60
APS         -0.60
Contrast    -0.60
eagerly     -0.59
Positive Logits
gotta       1.39
ain         1.21
gonna       1.08
're         1.06
wanna       1.06
got         1.01
fuckin      0.95
owe         0.89
deserve     0.87
ought       0.87
Activations Density 0.304%