Explanations
the name "Alex"
mentions of the name "Alex."
Negative Logits
fare       -0.69
office     -0.68
recy       -0.66
enegger    -0.66
purpose    -0.65
eling      -0.64
bred       -0.63
stakes     -0.62
unsub      -0.62
ATURE      -0.61
Positive Logits
ei         0.93
opoulos    0.91
inia       0.91
iev        0.89
andra      0.84
ulia       0.82
orean      0.80
arov       0.79
anian      0.78
iol        0.78
Activations Density: 0.016%