INDEX
Explanations
Occurrences of the letter 'w', or pronouns that start with 'w'
Negative Logits
Token        Logit
Mejía        -0.47
wzglę        -0.41
ilustracja   -0.41
Thier        -0.41
فرق          -0.41
lege         -0.40
niech        -0.39
Diez         -0.39
הכל          -0.38
bowiem       -0.38
Positive Logits
Token        Logit
w            2.69
W            0.83
wl           0.78
ww           0.71
wd           0.68
v            0.66
wk           0.66
wt           0.66
wad          0.65
iw           0.64
Activation Density: 0.082%
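
The page does not say how the density figure is computed. A common convention, assumed here, is the fraction of token positions at which the feature's activation is above zero. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the sample data are all hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature fires at all; the source page does not confirm this definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 8 of which are active.
sample = [0.0] * 10_000
for i in range(8):
    sample[i * 1000] = 1.5

density = activation_density(sample)
print(f"{density:.3%}")
```

Under this definition, a density of 0.082% would mean the feature is active on roughly 8 of every 10,000 token positions.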