Explanations
phrases indicating a return or restoration to a previous state
Negative Logits
Various      -0.14
ütün         -0.14
ibur         -0.14
anine        -0.14
_pv          -0.14
zw           -0.14
ledon        -0.14
рел          -0.13
различных    -0.13
onna         -0.13
Positive Logits
normal       0.34
basics       0.31
normal       0.28
sender       0.25
-normal      0.25
roots        0.24
Basics       0.23
fold         0.23
Normal       0.23
NORMAL       0.23
Activations Density: 0.103%