INDEX
Explanations
the word "this" in various contexts throughout the document
Negative Logits
ucht     -0.16
passion  -0.15
ones     -0.15
hai      -0.14
Sob      -0.14
Ones     -0.14
suspect  -0.14
ator     -0.14
hip      -0.13
inv      -0.13
Positive Logits
æ¹       0.19
utsch    0.16
illes    0.15
apes     0.15
csi      0.15
eroon    0.15
ameda    0.15
opis     0.14
-spe     0.14
oud      0.14
Activations Density: 0.055%