Explanations
references to a specific name "Tommy"
mentions of the name "Tommy."
Negative Logits
iard      -0.83
places    -0.81
ortment   -0.80
agher     -0.79
ividual   -0.78
elaide    -0.76
ership    -0.76
idences   -0.76
ences     -0.76
ered      -0.75
Positive Logits
Hil       0.87
Trash     0.84
Robinson  0.83
oshenko   0.81
Bone      0.77
Caldwell  0.76
kn        0.73
DeV       0.72
Tune      0.71
my        0.71
Activations Density: 0.030%