INDEX
Explanations
people's names
names of individuals or entities
Negative Logits
thereto       -0.66
/-            -0.64
#$            -0.63
mov           -0.62
xual          -0.59
respectively  -0.59
Magikarp      -0.59
initials      -0.58
mble          -0.58
stood         -0.58
Positive Logits
endish        0.80
itably        0.73
hower         0.72
ford          0.69
kefeller      0.69
ston          0.69
Stain         0.69
sonian        0.68
achusetts     0.67
anyahu        0.67
Activations Density 0.360%