INDEX
Explanations
the beginning of the document and introductory phrases
Negative Logits
Jereo            -0.55
Giuli            -0.48
isticas          -0.47
̀i               -0.45
Personendaten    -0.45
Hygge            -0.44
Flatten          -0.44
Gaia             -0.44
✭✭               -0.43
shared           -0.43
Positive Logits
!*\              0.60
resourceCulture  0.57
tvguidetime      0.57
muualla          0.56
хьтан            0.54
rungsseite       0.53
Waray            0.53
CWE              0.53
متحده            0.53
脚注の使い方      0.53
Activations Density 0.005%