INDEX
Explanations
phrases that solicit feedback or opinions from the audience
Negative Logits
ike     -0.15
unk     -0.15
ia      -0.14
Manus   -0.14
stood   -0.14
qrt     -0.14
éŃ      -0.13
ucken   -0.13
unkt    -0.13
avid    -0.13
Positive Logits
logy        0.16
ós          0.15
rif         0.15
oso         0.15
.rawValue   0.14
iÄĩ         0.14
Duy         0.14
ÐĿаÑģ      0.13
ãģ¡ãģ¯      0.13
Drv         0.13
Activations Density: 0.059%