INDEX
Explanations
Mentions of specific individuals, in particular "Silva" and "Medina."
Negative Logits
Token      Logit
rians      -0.79
fare       -0.76
neys       -0.75
cause      -0.72
rium       -0.72
MY         -0.72
acent      -0.72
ritic      -0.70
printed    -0.69
ILY        -0.68
Positive Logits
Token      Logit
iola       0.93
Paulo      0.84
Silva      0.79
nets       0.72
otti       0.71
Jiu        0.70
Rafael     0.70
Alvarez    0.69
Ventura    0.69
imar       0.69
Activations Density 0.004%