Explanations
words related to themes in art and education that involve social or political issues
Negative Logits
Token      Logit
es         -0.16
Irving     -0.14
ables      -0.14
affen      -0.14
orWhere    -0.14
lep        -0.14
ive        -0.14
Hughes     -0.13
ees        -0.13
ad         -0.13
Positive Logits
Token      Logit
ize        0.25
-out       0.21
iate       0.20
ted        0.19
-up        0.19
ify        0.19
lại        0.18
elize      0.18
out        0.18
IZE        0.16
Activations Density 0.363%