INDEX
Explanations
the name "Iz" with varying activations
the word "Iz" and its variations in different contexts
Negative Logits
quartered  -0.83
Norn       -0.76
srf        -0.70
whistle    -0.70
riott      -0.68
bably      -0.65
shroud     -0.65
¥ŀ         -0.65
satell     -0.64
limp       -0.62
Positive Logits
arro       1.37
iz         1.18
ombie      0.99
azz        0.96
abel       0.93
awa        0.92
iaz        0.91
ymes       0.91
azo        0.91
utsu       0.91
Activations Density: 0.008%