INDEX
Explanations
phrases indicating a range or selection of options
Negative Logits
anderen
-0.16
另外
-0.15
其他
-0.15
OTHER
-0.15
nier
-0.15
altri
-0.14
other
-0.14
另一
-0.14
">×</
-0.14
OTHER
-0.14
Positive Logits
simple
0.30
humble
0.26
smallest
0.25
simple
0.24
simples
0.24
简单
0.23
simplest
0.23
small
0.22
basic
0.21
inception
0.21
Activations Density 0.070%