Explanations
proper names (specifically the name "Carlos")
mentions of individuals named Carlos
Negative Logits
marked     -0.91
ihar       -0.77
mble       -0.77
illing     -0.76
fare       -0.75
raq        -0.73
payers     -0.70
acious     -0.70
acular     -0.68
rolog      -0.68
Positive Logits
Santana    0.95
Carlos     0.85
Niño       0.84
cano       0.83
otta       0.83
Gutierrez  0.82
Garcia     0.80
ques       0.79
imar       0.78
Martinez   0.77
Activations Density: 0.017%