INDEX
Explanations
proper names
proper names and specific identifiers related to individuals
Negative Logits
Skydragon      -0.83
ACTION         -0.77
acceleration   -0.71
simulator      -0.70
dictators      -0.70
Communists     -0.69
deflation      -0.69
downgrade      -0.68
sidx           -0.68
technical      -0.67
Positive Logits
ilyn           1.59
annah          1.49
helle          1.45
andra          1.44
yna            1.43
abeth          1.42
lene           1.42
annie          1.41
ricia          1.41
ilee           1.41
Activations Density: 0.263%