Explanations
mentions of individuals with their associated metrics or measures
Negative Logits
Token    Logit
ensa     -0.19
arkan    -0.17
ax       -0.17
emo      -0.17
yst      -0.17
/fw      -0.16
isma     -0.16
imple    -0.15
aker     -0.15
agg      -0.15
Positive Logits
Token    Logit
ee       0.19
roz      0.19
ched     0.19
cc       0.18
eller    0.18
h        0.18
cd       0.17
#ad      0.17
HAPP     0.16
ab       0.16
Activations Density: 0.033%