INDEX
Explanations
words related to proximity or accompaniment
Negative Logits
Token        Logit
ek           -0.15
ummer        -0.14
odal         -0.13
oris         -0.13
anh          -0.13
eros         -0.13
agon         -0.13
_ASSUME      -0.13
utations     -0.13
جز           -0.12
Positive Logits
Token        Logit
fra          0.18
ward         0.16
LastError    0.16
wards        0.15
neath        0.15
the          0.14
\/\/         0.14
füh          0.14
estar        0.14
ought        0.14
Activations Density 0.146%