Explanations
references to people's names, particularly the name "Anthony."
references to the name "Anthony."
Negative Logits
wagen     -0.89
å§«       -0.83
sbm       -0.81
dp        -0.81
EEK       -0.80
ãĥķ       -0.79
dog       -0.77
spring    -0.77
nir       -0.76
VOL       -0.74
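Token strings such as å§« and ãĥķ in the list above look like the printable stand-ins that a GPT-2-style byte-level BPE vocabulary uses for raw bytes, rather than corrupted text. The source does not say which tokenizer produced these tokens, so the sketch below is only an assumption: it rebuilds that byte-to-character table and maps a displayed token string back to UTF-8. The function names are illustrative and not taken from any library.

```python
def bytes_to_unicode():
    """Rebuild the GPT-2-style table mapping each byte (0-255) to a printable character.

    ASSUMPTION: the dashboard's tokens come from a tokenizer that uses this mapping.
    """
    # Bytes that are already printable map to themselves.
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    byte_values = printable[:]
    code_points = printable[:]
    offset = 0
    # Remaining bytes are shifted to code points starting at 256.
    for b in range(256):
        if b not in printable:
            byte_values.append(b)
            code_points.append(256 + offset)
            offset += 1
    return dict(zip(byte_values, (chr(c) for c in code_points)))


# Inverse table: displayed character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Turn a displayed vocabulary token back into readable text (hypothetical helper)."""
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    # A single token may hold only part of a multi-byte character, so
    # undecodable bytes are replaced instead of raising an error.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["å§«", "ãĥķ", "dog"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Under this assumption, å§« decodes to 姫 and ãĥķ decodes to フ, while plain ASCII tokens such as dog are unchanged.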
Positive Logits
Weiner    1.01
Anthony   0.87
Bour      0.86
Hopkins   0.80
Centauri  0.79
Russo     0.75
anton     0.74
Smith     0.71
Anthony   0.71
Joshua    0.70
Activations Density 0.012%