INDEX
Explanations
references to influential individuals or significant public figures
Negative Logits
okemon      -0.71
Ples        -0.67
Operation   -0.67
ilation     -0.67
oise        -0.65
prus        -0.63
contrace    -0.62
mble        -0.62
IFE         -0.61
Congo       -0.61
Positive Logits
head         1.16
heads        1.02
prominently  0.97
skating      0.89
aces         0.86
enance       0.82
aced         0.77
heading      0.76
polit        0.75
hood         0.74
Activations Density: 0.012%