INDEX
Explanations
references to U.S. states and their respective legislative or social actions
Negative Logits
pit      -0.74
arten    -0.70
Skies    -0.70
IFE      -0.68
undy     -0.68
izons    -0.67
raine    -0.67
arden    -0.66
unal     -0.66
awan     -0.65
Positive Logits
encrypt      0.74
tacit        0.72
scrut        0.72
deft         0.69
quietly      0.68
detected     0.67
attributes   0.67
hiba         0.67
brisk        0.67
invented     0.66
Activations Density 0.237%