Explanations
proper names and related terms
elements of names or phrases associated with specific individuals or entities
Negative Logits
GOODMAN      -0.81
whales       -0.68
Nile         -0.65
GGGG         -0.62
trainers     -0.62
underwater   -0.60
disapp       -0.58
faint        -0.57
Predator     -0.56
caucuses     -0.56
Positive Logits
uble         0.90
schild       0.90
enegger      0.85
kov          0.82
ramer        0.80
vere         0.79
inger        0.79
berger       0.77
lich         0.77
feld         0.76
Activations Density: 0.061%