INDEX
Explanations
temporal markers and narrative transitions
Negative Logits
Marino    -0.17
ogui      -0.16
aÅĻ       -0.15
ile       -0.15
ongan     -0.15
rypton    -0.14
èĹ        -0.14
GED       -0.14
GLE       -0.13
_WALL     -0.13
Positive Logits
ope       0.16
opes      0.15
sais      0.15
çİ        0.15
adeon     0.14
iro       0.14
imdi      0.14
för       0.14
iyah      0.14
DBG       0.14
Activations Density 0.280%
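
As a rough illustration of the density figure, the sketch below assumes that "activations density" means the fraction of sampled tokens on which the feature has a strictly positive activation. That definition, the function name `activation_density`, and the sample data are assumptions made for illustration; none of them come from this page.

```python
def activation_density(activations):
    """Return the fraction of tokens with a strictly positive activation.

    Assumed definition; the dashboard may compute density differently.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > 0)
    return active / len(activations)


# Hypothetical sample: 10,000 tokens, 28 of which activate the feature.
sample = [0.0] * 9972 + [1.5] * 28
density = activation_density(sample)
print(f"{density:.3%}")  # prints "0.280%"
```

Under that assumption, a density of 0.280% corresponds to the feature firing on roughly 28 of every 10,000 tokens.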