INDEX
Explanations
mentions of the word "Smith" in a document
mentions of the name "Smith."
Negative Logits
ADRA          -0.76
ktop          -0.67
ça            -0.64
stract        -0.62
ATING         -0.62
UGE           -0.61
ctory         -0.60
Greenpeace    -0.58
PDATE         -0.58
sympt         -0.58
Positive Logits
sonian        1.68
son           0.98
sburg         0.92
Barney        0.89
smanship      0.89
ies           0.87
gren          0.86
anity         0.84
field         0.84
inelli        0.84
Activations Density: 0.022%