INDEX
Explanations
specific locations or objects related to physical space or places
phrases indicating possession or location
Negative Logits
NPR         -0.71
atis        -0.67
inqu        -0.67
araoh       -0.66
official    -0.63
pez         -0.63
ocobo       -0.63
}}}         -0.62
¯           -0.60
#$          -0.60
Positive Logits
Grave       0.62
there       0.61
ply         0.60
Lange       0.60
Mineral     0.56
Winged      0.56
disguised   0.56
swast       0.56
renamed     0.54
converter   0.54
Activations Density: 0.555%