Explanations
words related to identification and categorization
Negative Logits
ÏĦÏģι      -0.15
stockholm  -0.15
stripe     -0.14
ittel      -0.14
CRET       -0.13
ackbar     -0.13
edis       -0.13
or         -0.13
elian      -0.13
arhus      -0.13
Positive Logits
soar     0.23
climb    0.22
ascend   0.20
dive     0.19
plunge   0.19
leap     0.16
jump     0.16
Builder  0.16
ascent   0.16
ساز      0.15
Activations Density: 0.027%