INDEX
Explanations
proper nouns referring to a person named "Tony"
the name "Tony."
Negative Logits
INESS               -0.79
cipline             -0.74
station             -0.71
Magikarp            -0.71
ãĤµãĥ¼ãĥĨãĤ£ãĥ¯ãĥ³  -0.71
ãģ¦                 -0.68
doors               -0.68
ebook               -0.68
actionGroup         -0.68
pmwiki              -0.68
Positive Logits
Tony     0.99
Romo     0.94
Blair    0.93
Abbott   0.92
Tony     0.87
Benn     0.80
Sop      0.80
isable   0.79
neau     0.79
Stark    0.77
Activations Density: 0.008%