Explanations
various data points, particularly numbers and their associated contexts
Negative Logits
elho      -0.19
usra      -0.17
habi      -0.17
°         -0.15
afari     -0.15
hab       -0.15
URRE      -0.15
Kenn      -0.14
urette    -0.14
ROTO      -0.14
Positive Logits
Ñģлов     0.16
ha        0.16
ÏĦά       0.15
315       0.15
Jer       0.15
ordinate  0.15
Jar       0.15
jar       0.14
éħį       0.14
lear      0.14
Activations Density 0.002%