INDEX
Explanations
references to societal issues and collective experiences
Negative Logits
Token     Logit
oria      -0.15
lical     -0.15
eden      -0.15
arius     -0.14
quelle    -0.14
CTOR      -0.14
ÃĹ        -0.14
èµĦæĸĻ    -0.14
ebi       -0.14
imo       -0.14
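
Two of the negative-logit tokens, ÃĹ and èµĦæĸĻ, look like raw byte-level BPE vocabulary strings rather than readable text. The sketch below assumes the underlying tokenizer uses the GPT-2 byte-to-unicode table (the page itself does not say which tokenizer is used) and shows how such strings could be mapped back to the characters they stand for. The helper name `decode_token` is hypothetical.

```python
def bytes_to_unicode():
    """Rebuild the GPT-2 style byte -> printable-character table.

    Assumption: the dashboard's tokenizer follows this scheme.
    """
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point above 255.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, (chr(c) for c in cs)))


# Inverse table: displayed character -> original byte value.
_CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Turn a byte-level vocabulary string back into UTF-8 text (hypothetical helper)."""
    raw = bytes(_CHAR_TO_BYTE[ch] for ch in token)
    # Invalid or partial UTF-8 sequences are shown as U+FFFD instead of raising.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(decode_token("ÃĹ"))      # ×
    print(decode_token("èµĦæĸĻ"))  # 资料
    print(decode_token("oria"))    # oria
```

Under that assumption, ÃĹ would be the multiplication sign and èµĦæĸĻ the Chinese word 资料 ("data, material"), while plain ASCII tokens such as oria decode to themselves.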
Positive Logits
Token     Logit
thing     0.26
country   0.22
Country   0.18
thing     0.18
Thing     0.17
/moment   0.16
here      0.16
moment    0.16
industry  0.16
country   0.16
Activations Density 0.131%