INDEX
Explanations
the word "have" in various contexts
Negative Logits
TRACE        -0.16
Translated   -0.15
EATURE       -0.14
adil         -0.14
Wort         -0.14
isas         -0.14
yal          -0.14
onal         -0.14
unal         -0.13
álido        -0.13
Positive Logits
ogi          0.16
cznie        0.15
Theater      0.15
ysz          0.15
ibur         0.15
$__          0.15
arger        0.15
.ini         0.15
Bucks        0.14
propensity   0.14
Activations Density 0.076%