INDEX
Explanations
instances of the word "this" and its variants
Negative Logits
nar           -0.17
eric          -0.15
izard         -0.14
uels          -0.14
rels          -0.13
ello          -0.13
sg            -0.13
same          -0.13
.setdefault   -0.13
personne      -0.13
Positive Logits
ones          0.40
latest        0.37
Latest        0.30
latest        0.29
particular    0.29
最新          0.28
Latest        0.26
Ones          0.26
/latest       0.25
batch         0.25
Activations Density 0.066%