Explanations
names related to a specific cultural or religious group
mentions of specific religious or ethnic groups, particularly Shiite Muslims
Negative Logits
fulness     -0.79
ividual     -0.70
fully       -0.69
theless     -0.69
代          -0.67
Zup         -0.66
uyomi       -0.66
Zac         -0.65
ãĥ¬         -0.65
Norn        -0.65
Positive Logits
¯¯¯¯        1.13
STON        1.07
keye        0.96
eworks      0.93
terness     0.92
¯¯¯¯¯¯¯¯    0.91
apers       0.90
neys        0.88
sts         0.83
hiba        0.83
Activations Density: 0.025%