Explanations
instances of the word "addition" indicating supplemental information or further explanations
Negative Logits
atten     -0.16
ivals     -0.15
گرÛĮ      -0.14
aju       -0.14
zenia     -0.14
apsed     -0.14
ë¹Ī       -0.14
åħĴ       -0.13
à¥įबर     -0.13
ãģªãĤĵ    -0.13
Positive Logits
tion      0.27
nal       0.22
ition     0.20
ally      0.19
phia      0.18
eel       0.18
iton      0.16
tal       0.16
ordinary  0.16
ormal     0.16
Activations Density 0.020%