INDEX
Explanations
phrases indicating direction or progression towards a goal or concept
Negative Logits
/w               -0.16
ek               -0.15
imer             -0.14
ares             -0.14
ute              -0.14
IConfiguration   -0.13
arity            -0.13
říklad           -0.13
_NEAR            -0.13
losing           -0.13
Positive Logits
GGLE             0.18
ies              0.17
toward           0.17
Tow              0.16
/from            0.16
towards          0.16
ement            0.16
ships            0.15
esch             0.15
sWith            0.14
Activations Density: 0.027%