INDEX
Explanations
phrases indicating quantities or measurements related to topics
Negative Logits
Token     Logit
antan     -0.18
&a        -0.17
onu       -0.16
ait       -0.15
ied       -0.15
uyết      -0.15
γÏĮ       -0.14
emmel     -0.14
elf       -0.14
frey      -0.13
Positive Logits
Token      Logit
nowhere    0.35
bounds     0.26
sight      0.24
reach      0.23
necessity  0.21
town       0.21
Bounds     0.20
           0.20
sheer      0.19
town       0.19
Activations Density 0.055%