INDEX
Explanations
mentions of a person called Wong
mentions of the name "Wong."
Negative Logits
phis                              -0.78
ctic                              -0.70
selves                            -0.69
commissioner                      -0.67
phas                              -0.66
Commissioner                      -0.66
getic                             -0.65
ICAN                              -0.65
________________________________  -0.64
rian                              -0.62
Positive Logits
Wong      1.22
nood      1.02
Chun      0.95
Nguyen    0.91
Fei       0.90
awei      0.88
enhagen   0.87
aii       0.85
Zhu       0.84
noodles   0.84
Activations Density: 0.004%