INDEX
Explanations
sentences related to social commentary or critiques of society
Negative Logits
onal            -0.16
Broad           -0.16
earch           -0.15
Randall         -0.15
broad           -0.15
Grü             -0.14
GenerationType  -0.14
Colum           -0.14
acer            -0.14
Broad           -0.13
Positive Logits
REP             0.16
971             0.15
672             0.15
istrovství      0.14
orz             0.14
cker            0.14
quez            0.14
472             0.14
/inet           0.13
ПК              0.13
Activations Density 0.551%