INDEX
Explanations
the word "just" in various forms and contexts
Negative Logits
pleaſure      -0.87
Efq           -0.85
rungsseite    -0.83
Reſ           -0.79
jspx          -0.78
ſelf          -0.77
Majefty       -0.76
Jefus         -0.75
ſelves        -0.74
aarrggbb      -0.73
Positive Logits
like          0.87
barely        0.67
simply        0.64
enough        0.63
plain         0.63
before        0.62
as            0.62
about         0.58
zoals         0.54
such          0.54
Activations Density 0.069%
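
The density figure above is most naturally read as the share of token positions on which the feature fires. The page does not define it, so that reading is an assumption. Under it, a minimal sketch of the calculation might look like the following. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and are not taken from the tool that produced this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of activations strictly above `threshold`.

    Assumption: "density" means the fraction of token positions on which
    the feature is active (activation > threshold), expressed in percent.
    """
    if not activations:
        raise ValueError("need at least one activation value")
    fired = sum(1 for a in activations if a > threshold)
    return 100.0 * fired / len(activations)


# Hypothetical sample: 10,000 token positions, 7 of which fire.
sample = [0.0] * 9993 + [1.2, 0.4, 3.1, 0.9, 2.2, 0.1, 5.0]
density = activation_density(sample)
print(f"{density:.3f}%")
```

A value as low as 0.069% would mean the feature is active on fewer than one token position in a thousand, which fits a feature tied to a single word such as "just".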