INDEX
Explanations
occurrences of the word "No" in various contexts
Negative Logits
rik        -0.17
gh         -0.16
rent       -0.15
æ©         -0.15
-ÑĤо       -0.15
ritis      -0.15
ycz        -0.15
patrick    -0.14
ά          -0.14
rol        -0.14
Positive Logits
xious      0.32
longer     0.31
isy        0.27
veau       0.25
ises       0.25
zzle       0.24
matter     0.24
doubt      0.23
Longer     0.23
things     0.23
Activations Density 0.120%