INDEX
Explanations
references to political figures, particularly the name "Blair"
mentions of the name "Blair."
Negative Logits
Magikarp     -0.96
uality       -0.79
entric       -0.74
ymes         -0.72
Temperature  -0.72
ktop         -0.70
Roaming      -0.70
ENSE         -0.68
Viol         -0.68
ARS          -0.68
Positive Logits
Blair        0.99
umenthal     0.85
ites         0.80
anke         0.78
abies        0.77
ite          0.77
icus         0.76
anche        0.74
stein        0.73
ock          0.73
Activations Density 0.004%