INDEX
Explanations
references to time, particularly the past
Negative Logits
fav       -0.18
ört       -0.16
jav       -0.15
iasi      -0.15
Dale      -0.14
ouns      -0.14
annies    -0.14
Lair      -0.14
nerg      -0.14
vore      -0.14
Positive Logits
alat      0.17
eto       0.16
redis     0.16
exe       0.15
arro      0.15
оÑģÑĮ      0.15
Fach      0.14
çIJ³       0.14
zbo       0.14
INY       0.14
Activations Density 0.203%