INDEX
Explanations
words related to transitions or changes in context
Negative Logits
Token      Logit
ĵn         -0.15
rack       -0.15
oment      -0.14
owing      -0.14
untu       -0.14
سÙĩ        -0.13
ÏĨαÏģ      -0.13
ady        -0.13
stadt      -0.13
неÑģÑĤи    -0.13
Positive Logits
Token          Logit
gon            0.19
pad            0.17
transitions    0.15
anan           0.15
obra           0.14
transition     0.14
Ñħод           0.14
ives           0.14
.transition    0.14
uan            0.14
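Several tokens in the lists above (for example `Ñħод` and `неÑģÑĤи`) look like raw byte-level BPE vocabulary strings rather than readable text. The sketch below shows one way to turn them back into text. It assumes the underlying model uses GPT-2's byte-to-character table, which this page does not state; `gpt2_byte_table` and `decode_token` are hypothetical helper names, not part of any dashboard API.

```python
def gpt2_byte_table():
    """Rebuild the byte -> printable-character table used by GPT-2's byte-level BPE.

    Assumption: the tokens on this page come from a vocabulary that uses this table.
    """
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    # Printable bytes map to themselves.
    table = {b: chr(b) for b in printable}
    # Remaining bytes (control characters, space, etc.) are shifted to U+0100 and up.
    shift = 0
    for b in range(256):
        if b not in table:
            table[b] = chr(256 + shift)
            shift += 1
    return table


CHAR_TO_BYTE = {ch: b for b, ch in gpt2_byte_table().items()}


def decode_token(token: str) -> str:
    """Map each character back to its byte, then decode the bytes as UTF-8.

    Characters outside the table (already-readable Cyrillic, Greek, Arabic, ...)
    are passed through unchanged. Incomplete byte sequences become U+FFFD.
    """
    raw = bytearray()
    for ch in token:
        if ch in CHAR_TO_BYTE:
            raw.append(CHAR_TO_BYTE[ch])
        else:
            raw.extend(ch.encode("utf-8"))
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["Ñħод", "неÑģÑĤи", "ÏĨαÏģ", "transition"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Under this assumption, `Ñħод` decodes to `ход` and `неÑģÑĤи` to `нести`. A token such as `ĵn` starts with a lone continuation byte, so it is a fragment of a longer character and has no standalone readable form. The pass-through rule is a heuristic: a token that genuinely contains a Latin-1 character such as `Ñ` would be misread as a raw byte.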
Activations Density 0.007%