INDEX
Explanations
questions asking for opinions or thoughts on various topics or articles
questions and phrases related to opinions or thoughts
Negative Logits
vity        -0.82
ruit        -0.78
ante        -0.77
ulo         -0.70
arms        -0.68
abo         -0.68
dig         -0.67
rote        -0.67
ynthesis    -0.66
ring        -0.66
Positive Logits
whether          0.75
76561            0.72
behalf           0.71
homosexuality    0.71
Rating           0.70
unres            0.69
respecting       0.69
awed             0.67
aloud            0.67
regard           0.66
Activations Density: 0.124%