INDEX
Explanations
names related to people and places
specific names and terms related to individuals or entities
Negative Logits
Eck          -0.62
cod          -0.59
Ness         -0.58
Wolf         -0.56
Cath         -0.56
Comet        -0.56
Eleanor      -0.55
Beck         -0.55
WW           -0.54
homophobic   -0.54
Positive Logits
obin         0.95
nir          0.92
enei         0.91
chev         0.90
ogg          0.87
arij         0.85
arov         0.80
endar        0.80
afi          0.79
src          0.76
Activations Density: 0.066%