Explanations
phrases related to starting or initiating actions and events
Negative Logits
Token          Logit
Kaynak         -0.16
azen           -0.15
/Typography    -0.14
Ñĸдно          -0.14
hed            -0.14
atz            -0.14
imers          -0.14
alon           -0.14
ufen           -0.14
iyon           -0.13
Positive Logits
Token          Logit
start          0.31
boxing         0.30
starter        0.30
ass            0.29
ass            0.28
started        0.28
-start         0.27
butt           0.27
off            0.26
-off           0.26
Activations Density 0.010%