INDEX
Explanations
phrases that indicate frequency or repetition
Negative Logits
vala      -0.16
zych      -0.15
omor      -0.15
EFAULT    -0.15
ioxide    -0.14
ropa      -0.14
rief      -0.14
üt        -0.14
елик      -0.14
HID       -0.14
Positive Logits
so        0.22
blue      0.22
Blue      0.21
once      0.21
now       0.19
/blue     0.18
tanto     0.18
Blue      0.17
oc        0.17
Mann      0.17
Activations Density: 0.022%