INDEX
Explanations
references to significant violent events or incidents
Negative Logits
sized        -0.68
congress     -0.67
thumbnail    -0.66
Pand         -0.65
figure       -0.65
caf          -0.64
likeness     -0.63
match        -0.63
Bund         -0.62
PATH         -0.62
Positive Logits
hov          1.07
hi           0.86
staking      0.84
stress       0.78
peria        0.75
orthodox     0.75
iko          0.74
fter         0.74
pin          0.72
ounding      0.72
Activations Density: 0.216%