INDEX
Explanations
conjunctions and transition phrases that connect ideas or arguments
Negative Logits
lei       -0.17
visor     -0.15
Ïħγ       -0.15
stvo      -0.15
foy       -0.14
élé       -0.13
mutable   -0.13
ele       -0.13
ritt      -0.13
ahu       -0.13
Positive Logits
appe      0.15
ellig     0.14
ilia      0.14
nock      0.14
famously  0.14
recap     0.14
odi       0.13
plat      0.13
lok       0.13
elia      0.13
Activations Density 0.001%