INDEX
Explanations
Italian words or phrases
Occurrences of the substring "lla" within words, particularly words that end in "lla"
Negative Logits
citiz      -0.79
unden      -0.77
lay        -0.74
states     -0.70
lif        -0.69
States     -0.69
rals       -0.68
liners     -0.68
balanced   -0.67
lied       -0.65
Positive Logits
uthor      0.96
ppo        0.93
ppa        0.91
lla        0.83
udic       0.81
ppe        0.79
ignt       0.79
ength      0.77
emon       0.76
Rosa       0.74
Activations Density: 0.020%