INDEX
Explanations
conversational questions and phrases about opinion and agreement
Negative Logits
vel         -0.16
athom       -0.15
bush        -0.15
Institutes  -0.15
xbd         -0.14
hora        -0.14
opsis       -0.14
extremely   -0.14
ZN          -0.14
Bush        -0.14
Positive Logits
regon       0.16
sert        0.16
Hend        0.16
784         0.15
628         0.15
ưởng        0.14
jamin       0.14
hourly      0.14
EXEMPLARY   0.14
ордин       0.14
Activations Density 0.003%