INDEX
Explanations
themes related to social and political injustice
Negative Logits
ioxide      -0.16
ær          -0.15
adera       -0.14
ekl         -0.14
FromBody    -0.14
reducers    -0.14
.once       -0.14
угод        -0.14
そこ        -0.13
relude      -0.13
Positive Logits
although    0.73
despite     0.70
while       0.61
although    0.60
though      0.59
even        0.59
Although    0.56
虽然        0.54
尽管        0.52
Although    0.51
Activations Density 0.410%