INDEX
Explanations
references to global or world contexts
Negative Logits
zelf       -0.19
ety        -0.17
cht        -0.17
imson      -0.16
elor       -0.16
aign       -0.15
ToWorld    -0.15
ersen      -0.15
elow       -0.14
เกล        -0.14
Positive Logits
Wide       0.34
-wide      0.33
Wide       0.30
wide       0.30
liness     0.29
wide       0.29
views      0.26
-ren       0.26
iyon       0.19
-class     0.19
Activations Density 0.086%