INDEX
Explanations
themes related to social issues, particularly human rights and identity
Negative Logits
onta          -0.14
ipe           -0.14
thing         -0.14
iated         -0.13
iei           -0.13
elter         -0.13
.community    -0.13
associate     -0.13
ients         -0.13
341           -0.13
Positive Logits
evolution      0.24
intersection   0.24
mechanics      0.23
role           0.22
nuts           0.21
meaning        0.21
roles          0.21
intersections  0.20
relationship   0.20
changing       0.19
Activations Density: 0.160%