INDEX
Explanations
names of individuals
references to political figures and significant events
Negative Logits
perse        -0.45
NetMessage   -0.44
OPA          -0.43
laus         -0.43
venge        -0.40
rg           -0.40
Recon        -0.39
semble       -0.39
ramid        -0.38
haar         -0.38
Positive Logits
halla        0.51
querque      0.46
isphere      0.42
equivalents  0.42
detract      0.41
outweigh     0.41
lasts        0.40
igible       0.40
oided        0.40
rivals       0.40
Activations Density: 2.958%