INDEX
Explanations
references to information inside brackets or encoded data
Negative Logits
Token      Logit
Eighth     -0.19
eighth     -0.18
enville    -0.16
imed       -0.16
fal        -0.15
redient    -0.15
انه        -0.15
ibri       -0.14
gain       -0.14
}};↵       -0.14
Positive Logits
Token        Logit
19           0.63
18           0.57
nineteen     0.43
nineteenth   0.39
eighteen     0.38
019          0.37
018          0.33
۱۹           0.32
ninete       0.32
۱۸           0.32
Activations Density 0.069%