INDEX
Explanations
numerical identifiers or abbreviations, possibly related to names or categories
Negative Logits
utschen      -0.15
oine         -0.15
_TER         -0.15
ainment      -0.15
opensource   -0.15
oire         -0.14
ddit         -0.14
ONGL         -0.14
ři           -0.14
íně          -0.14
Positive Logits
jet          0.18
zn           0.16
istro        0.15
inkle        0.14
jang         0.14
ket          0.14
прави        0.14
eum          0.13
UBL          0.13
ez           0.13
Activations Density 0.198%
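
A note on the non-ASCII entries in the logit lists: token strings on dashboards like this one often come from a byte-level BPE vocabulary, where every UTF-8 byte is displayed as a stand-in character, so a Czech or Cyrillic fragment can show up as a string such as `ÅĻi`, `ÃŃnÄĽ` or `пÑĢави`. The page does not say which tokenizer is used, so the sketch below assumes the GPT-2 style byte-to-character table. The function name `decode_token` is hypothetical, and characters that are not in the table are passed through unchanged.

```python
def bytes_to_unicode():
    """Build the GPT-2 style byte -> display-character table (assumed tokenizer)."""
    # Bytes that are already printable keep their own character.
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    byte_values = printable[:]
    code_points = printable[:]
    offset = 0
    # Every remaining byte is shifted to an unused code point starting at 256.
    for b in range(256):
        if b not in printable:
            byte_values.append(b)
            code_points.append(256 + offset)
            offset += 1
    return dict(zip(byte_values, map(chr, code_points)))


# Inverse table: display character -> original byte value.
CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(raw):
    """Turn a raw vocabulary string back into readable text (hypothetical helper)."""
    out = bytearray()
    for ch in raw:
        if ch in CHAR_TO_BYTE:
            out.append(CHAR_TO_BYTE[ch])
        else:
            # Character is not a byte stand-in (already decoded): keep it as-is.
            out.extend(ch.encode("utf-8"))
    # Token fragments can end mid-character, so replace instead of raising.
    return out.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for raw in ["ÅĻi", "ÃŃnÄĽ", "пÑĢави", "jet"]:
        print(repr(raw), "->", repr(decode_token(raw)))
```

Under that assumption the three strings decode to `ři`, `íně` and `прави`, while plain ASCII tokens such as `jet` come back unchanged.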