Explanations
phrases that express relative measurements of size or quantity
Negative Logits
ome       -0.16
urette    -0.15
aman      -0.15
igon      -0.15
igu       -0.15
ipsis     -0.15
ниц       -0.14
ific      -0.14
incinn    -0.14
aar       -0.14
Positive Logits
/high       0.28
(er         0.24
-tech       0.24
lander      0.22
-rise       0.22
landers     0.21
-profile    0.18
tide        0.18
school      0.18
-powered    0.18
Activations Density 0.041%