INDEX
Explanations
phrases starting with the word "These"
the word "these."
Negative Logits
Token      Logit
Adds       -0.74
obar       -0.70
rupted     -0.67
obook      -0.66
let        -0.66
mma        -0.65
Ģ          -0.64
ossier     -0.64
ga         -0.64
terness    -0.64
Positive Logits
Token      Logit
guys       1.31
aren       1.29
are        1.28
kinds      1.16
weren      1.13
days       1.11
sorts      1.06
dudes      1.05
were       1.05
fellows    1.01
Activations Density 0.090%