INDEX
Explanations
occurrences of the word "just" and its variations
Negative Logits
cky        -0.17
ares       -0.15
dra        -0.14
ctl        -0.14
err        -0.14
itals      -0.14
iland      -0.14
626        -0.13
only       -0.13
éĸ¢        -0.13
Positive Logits
zes        0.16
part       0.16
beginning  0.15
inium      0.15
Beginning  0.15
oldur      0.15
another    0.14
Means      0.14
sembl      0.14
ÑĤеÑĢн     0.14
Activations Density 0.061%
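
The density figure above is commonly read as the fraction of sampled token positions on which the feature fires at all. The sketch below shows that calculation under that assumption; the function name `activation_density`, the zero threshold, and the sample data are all hypothetical and are not taken from this page or from any specific tool.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means (active positions) / (all positions).
    """
    values = list(activations)
    if not values:
        raise ValueError("activations must be non-empty")
    # Count positions where the feature is considered active.
    active = sum(1 for a in values if a > threshold)
    return active / len(values)


# Hypothetical sample: 10,000 token positions, 6 of them active.
sample = [0.0] * 10_000
for position, value in [(5, 0.8), (120, 1.2), (999, 0.3),
                        (4000, 2.1), (7500, 0.5), (9999, 0.9)]:
    sample[position] = value

density = activation_density(sample)
print(f"Activations density: {density:.3%}")
```

A density this low would mean the feature is sparse: it is silent on the vast majority of tokens and active only on a narrow set of contexts.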