Explanations
expressions related to transformation and recovery
Negative Logits
token       logit
OTA         -0.16
iq          -0.16
dip         -0.14
ota         -0.14
rics        -0.14
ɵ           -0.14
опÑĢи       -0.14
zier        -0.14
çħ§         -0.14
Surprise    -0.13
Positive Logits
token       logit
turn        0.44
turned      0.42
turned      0.40
turn        0.40
Turn        0.38
-turn       0.38
TURN        0.36
.turn       0.35
around      0.34
turns       0.33
Activations Density 0.025%