Explanations
phrases related to beginnings or starting points
Negative Logits
Token     Logit
ague      -0.15
uida      -0.14
acute     -0.14
ево       -0.14
½         -0.14
uil       -0.14
uty       -0.14
inez      -0.14
ãĥ³ãĥģ    -0.13
utch      -0.13
Positive Logits
Token     Logit
ge        0.32
ges       0.31
gew       0.28
gesch     0.28
ged       0.26
gel       0.26
geb       0.26
gest      0.26
ger       0.25
z         0.24
Activations Density: 0.012%