Explanations
Italian words or phrases
Negative Logits
lasses      -0.92
ipeg        -0.88
manship     -0.85
rah         -0.83
\\\\\\\\    -0.82
rill        -0.81
lessness    -0.80
ridges      -0.78
ingham      -0.77
ramer       -0.76
Positive Logits
zzi         1.12
Rossi       0.98
ucci        0.95
Galile      0.88
zzo         0.86
Giul        0.85
otti        0.85
etta        0.84
Luigi       0.84
Giovanni    0.84
Activations Density: 1.156%