INDEX
Explanations
proper nouns, particularly the name "Greg."
references to the name "Greg."
Negative Logits
lly        -0.80
xual       -0.78
ties       -0.73
making     -0.69
xus        -0.68
cffffcc    -0.62
stew       -0.59
llah       -0.59
LOAD       -0.58
geon       -0.58
Positive Logits
orio       1.46
orian      1.40
orius      1.35
arious     1.35
ory        1.15
orians     1.05
orie       0.98
or         0.97
ories      0.96
ersen      0.94
Activations Density: 0.029%