INDEX
Explanations
Proper nouns, specifically names of places or individuals
Mentions of specific names or terms, particularly those ending in "ona"
Negative Logits
Token       Logit
ulative     -0.89
leness      -0.81
regular     -0.77
sworth      -0.76
liners      -0.76
sg          -0.74
lings       -0.74
shaw        -0.70
best        -0.70
arcity      -0.70
Positive Logits
Token       Logit
issance     1.37
isance      1.11
velength    1.07
ire         0.85
nect        0.84
ignt        0.82
ftime       0.81
uthor       0.80
onite       0.80
eus         0.79
Activations Density 0.014%
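
One common reading of a density figure like this is the fraction of sampled token positions on which the feature fires, meaning its activation is above zero. The page does not state its exact definition, so the sketch below is an assumption rather than the dashboard's actual method; the function name, the threshold parameter, and the toy data are all hypothetical.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means (active positions) / (all positions).
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical toy data: 10,000 token positions, 2 of which are active.
toy = [0.0] * 9998 + [0.7, 1.3]
density = activation_density(toy)
print(f"{density:.3%}")
```

Under this reading, a density of 0.014% would correspond to roughly 14 active positions per 100,000 sampled tokens.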