INDEX
Explanations
phrases related to social commentary or criticism
Negative Logits
uum           -0.69
Airl          -0.68
ributed       -0.68
ascript       -0.67
icidal        -0.66
ieu           -0.65
eyed          -0.65
imb           -0.63
Institution   -0.61
ashed         -0.61
Positive Logits
adays         1.40
here          0.96
belongs       0.79
comes         0.73
nir           0.73
confronts     0.72
resides       0.71
defunct       0.71
behold        0.71
awaits        0.70
Activations Density 0.049%