INDEX
Explanations
occurrences of the word "leads."
Negative Logits
die         -0.62
po          -0.56
wnątrz      -0.55
I           -0.53
ab          -0.53
اشی         -0.53
filepath    -0.53
款          -0.52
jante       -0.51
ob          -0.51
Positive Logits
itſelf      1.23
Houſe       1.19
pleaſure    1.06
Reſ         1.05
ſelves      1.04
Theſe       1.03
houſe       1.03
ſelf        1.02
Monfieur    1.01
reaſon      1.00
Activations Density: 0.075%