INDEX
Explanations
specific names related to nuclear plants or incidents
references to specific places and individuals associated with significant events
Negative Logits
nesday       -0.93
aints        -0.90
abies        -0.87
ace          -0.80
cific        -0.77
mingham      -0.76
icago        -0.73
ounter       -0.72
adders       -0.72
uration      -0.71
Positive Logits
Dai          1.08
conom        0.81
VERTISEMENT  0.80
forth        0.76
Flavoring    0.76
ept          0.74
PUT          0.74
ãĥ´          0.74
女           0.73
Mub          0.72
Activations Density 0.021%