INDEX
Explanations
mentions of specific names related to politics, possibly in conflict regions
names of geographic regions, organizations, or prominent individuals associated with those areas
Negative Logits
ENCY        -0.74
Doodle      -0.67
monkey      -0.66
ciplinary   -0.63
Korean      -0.63
Scorpion    -0.63
amus        -0.62
龍契士      -0.62
pies        -0.61
Korea       -0.61
Positive Logits
rina        0.85
dar         0.82
azard       0.81
rd          0.80
bian        0.79
raviolet    0.79
edly        0.78
rons        0.77
ril         0.75
iated       0.75
Activations Density 0.041%