INDEX
Explanations
phrases indicating proximity or location
Negative Logits
ì°©     -0.16
вано    -0.15
insk    -0.15
PING    -0.15
onis    -0.14
δÏģο    -0.14
orsk    -0.14
ниже    -0.13
ãĥ¥     -0.13
yo      -0.13
Positive Logits
lessly  0.19
s       0.17
lined   0.17
lier    0.16
lug     0.15
ish     0.15
side    0.14
ime     0.14
ä¹İ     0.14
291     0.14
Activations Density 0.022%
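
For readers unfamiliar with the density figure, the sketch below shows one common way such a number is computed: the fraction of token positions at which the feature's activation is above zero. This is an assumption about the metric, not a statement of how this dashboard calculates it; the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions where the feature fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 2 of which activate the feature.
sample = [0.0] * 10_000
sample[17] = 1.3
sample[4242] = 0.4

density = activation_density(sample)
print(f"{density:.3%}")  # formatted as a percentage, like the figure above
```

Under this reading, a density of 0.022% would mean the feature fires on roughly 2 of every 10,000 tokens.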