Explanations
references to a specific person named Perez
mentions of the name "Perez."
Negative Logits
Token       Logit
ories       -0.95
ivities     -0.86
liness      -0.83
liest       -0.79
RAFT        -0.77
orically    -0.74
ocaust      -0.72
spring      -0.72
iveness     -0.70
Columb      -0.70
Positive Logits
Token       Logit
Perez       0.88
ocalypse    0.81
irez        0.80
ktop        0.79
icer        0.78
ilon        0.77
Maker       0.73
achy        0.72
tti         0.72
Hilton      0.70
Activations Density 0.007%