INDEX
Explanations
mentions of Afghanistan and Iraq
Negative Logits
iece        -0.81
anni        -0.78
oxide       -0.77
Pwr         -0.76
20439       -0.74
arse        -0.73
uble        -0.73
xual        -0.70
pose        -0.69
monton      -0.67
Positive Logits
Gutenberg   0.69
licens      0.68
DRM         0.68
doors       0.67
FreeBSD     0.67
eBay        0.66
patents     0.65
ARDIS       0.64
coron       0.64
entrants    0.64
Activations Density: 0.051%