Explanations
references to calculations and numerical data
Negative Logits
azu      -0.16
chal     -0.14
agn      -0.14
nost     -0.14
ghi      -0.14
chl      -0.13
chner    -0.13
tol      -0.13
Herc     -0.13
chi      -0.13
Positive Logits
éĢ        0.15
uts       0.15
klä       0.14
inkel     0.14
ibir      0.14
idlo      0.14
وث        0.14
металли   0.13
тур       0.13
ají       0.13
Activations Density 0.003%