INDEX
Explanations
political entities and activities
Negative Logits
Token       Logit
sters       -0.81
nai         -0.77
thy         -0.76
Adults      -0.74
SPONSORED   -0.74
Mothers     -0.72
gdala       -0.71
Person      -0.71
nikov       -0.70
wives       -0.69
Positive Logits
Token       Logit
rearr       0.97
reuse       0.96
reorgan     0.95
delet       0.93
rewrite     0.93
recycle     0.93
redund      0.92
restruct    0.92
decom       0.92
decomp      0.91
Activations Density: 0.220%