Explanations
phrases indicating distance or extent
Negative Logits
ollo      -0.17
ãģĴ       -0.16
erson     -0.16
anan      -0.15
svc       -0.15
noch      -0.14
inger     -0.14
obus      -0.14
à¸Ńà¸ļ    -0.14
ador      -0.14
Positive Logits
concerned  0.22
FETCH      0.20
goes       0.18
apt        0.18
fetched    0.18
Concern    0.16
go         0.15
-go        0.15
/***/      0.15
eo         0.15
Activations Density: 0.005%