Explanations
words related to male names or entities containing "Wh"
repeated instances of the character sequence "Wh"
Negative Logits
Token       Logit
CVE         -0.92
actionDate  -0.82
alian       -0.79
coded       -0.73
appendix    -0.73
CTR         -0.66
fecture     -0.66
а           -0.63
hap         -0.62
ANI         -0.61
Positive Logits
Token       Logit
Wh          3.70
Wh          2.49
wh          1.97
wh          1.76
WH          1.65
Whale       1.52
Whe         1.43
Whis        1.39
WH          1.29
Whit        1.26
Activations Density: 0.013%