INDEX
Explanations
capitalized words or acronyms related to military or specific geographical locations
instances of single-letter words or abbreviations
Negative Logits
Nich          -0.82
Sally         -0.81
Wyn           -0.80
Whites        -0.78
Molly         -0.78
Mania         -0.77
Nich          -0.72
Wid           -0.71
Username      -0.71
Nationwide    -0.71
Positive Logits
ar            1.64
AR            1.50
ars           1.44
arp           1.28
aris          1.25
arak          1.23
ari           1.22
aro           1.20
arat          1.20
ared          1.18
Activations Density: 0.149%