Explanations
questions that challenge opinions or ideas
Negative Logits
erland      -0.09
AYER        -0.07
atrix       -0.07
ocache      -0.06
lei         -0.06
arna        -0.06
pragma      -0.06
βο          -0.06
dma         -0.06
amburger    -0.06
Positive Logits
इतन         0.08
Whenever    0.07
bother      0.07
whenever    0.07
insists     0.06
こんな      0.06
CAPE        0.06
so          0.06
tão         0.06
Whenever    0.06
Activations Density 0.019%