Explanations
structured numerical values and references to specific entities or categories
Negative Logits
ovo      -0.17
ingo     -0.16
pong     -0.16
pon      -0.16
tridge   -0.14
ctp      -0.14
تز       -0.14
ослав    -0.14
Bil      -0.14
рати     -0.14
Positive Logits
ufe        0.16
cows       0.15
Secondary  0.15
dry        0.15
eniable    0.14
Nit        0.14
baugh      0.14
ewan       0.14
Wichita    0.14
Secondary  0.14
Activations Density 0.030%