INDEX
Explanations
names of people or entities
proper nouns, particularly names associated with individuals or entities
Negative Logits
Token       Logit
rophe       -0.59
duration    -0.58
Cooldown    -0.57
erver       -0.57
holiest     -0.55
width       -0.55
mileage     -0.55
blat        -0.54
SOURCE      -0.53
lid         -0.53
Positive Logits
Token       Logit
neys        0.96
ernaut      0.83
ilee        0.81
usalem      0.79
etus        0.77
okia        0.76
isco        0.75
iard        0.75
iffe        0.74
eki         0.73
Activations Density: 0.086%
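
The page does not define how the density figure is computed. A common reading, assumed here, is the fraction of sampled token positions on which the feature has a nonzero activation. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the synthetic activations are hypothetical and chosen so the arithmetic lands on the same percentage.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means (positions where the feature fires) / (all
    sampled positions). The source page does not state its definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Synthetic data (not from the source): 100,000 positions, 86 of them active.
acts = [0.0] * 100_000
for i in range(86):
    acts[i * 1000] = 1.5  # arbitrary positive activation value

density = activation_density(acts)
print(f"{density:.3%}")  # prints 0.086%
```

Under this reading, a density of 0.086% means the feature fires on roughly 86 of every 100,000 sampled tokens, which fits a sparse feature tied to a narrow class of inputs such as proper nouns.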