INDEX
Explanations
numerical values and their equivalents in various contexts
Negative Logits
adele     -0.15
neutral   -0.15
stvo      -0.14
ruc       -0.14
anco      -0.14
captures  -0.14
ansen     -0.14
ĥĿ        -0.14
agger     -0.14
lette     -0.14
Positive Logits
uku       0.15
FRING     0.15
-equ      0.15
rement    0.14
书记      0.14
erd       0.14
âk        0.14
reas      0.14
дам       0.13
CCR       0.13
Activations Density: 0.157%