INDEX
Explanations
mentions of the state code "WA" followed by a high activation value
references to Washington (WA) in various contexts
Negative Logits
erous          -0.75
displayText    -0.73
onial          -0.68
matter         -0.67
sson           -0.66
sis            -0.65
rious          -0.62
ãĤ¨ãĥ«         -0.62
accur          -0.61
ãģĻ            -0.61
Positive Logits
VE             1.17
velength       1.11
ILA            1.02
WA             0.98
atche          0.93
apon           0.93
ITH            0.91
UGH            0.90
IVERS          0.87
HAHA           0.84
Activations Density: 0.010%