INDEX
Explanations
proper nouns, particularly names of people and organizations
instances of the word "add" and its variations in the text
Negative Logits
GGGGGGGG   -0.64
departure  -0.60
STAT       -0.59
compare    -0.58
votes      -0.58
Wings      -0.57
SPONSORED  -0.57
wave       -0.56
··         -0.56
COLOR      -0.56
Positive Logits
itionally  1.13
itional    1.12
icted      1.05
eus        1.01
icts       0.99
ishly      0.99
iction     0.97
erella     0.97
sworth     0.96
ington     0.96
Activations Density: 0.034%