INDEX
Explanations
the word 'meaning', sometimes with related words
Negative Logits
Token    Logit
.        -0.85
,        -0.74
vs       -0.71
!        -0.68
↵↵       -0.67
-        -0.67
/        -0.66
-        -0.65
(        -0.65
(        -0.60
Positive Logits
Token    Logit
NUMX     1.44
―――――    1.39
myſelf   1.34
itſelf   1.28
iſt      1.23
་་       1.20
ſind     1.18
ſelf     1.17
leſs     1.16
Egli     1.16
Activations Density 1.880%