INDEX
Explanations
the word "just" in various contexts
just followed by modifier
Negative Logits
Token       Logit
pleaſure    -0.89
Efq         -0.87
ſelf        -0.85
Majefty     -0.82
ſelves      -0.80
Reſ         -0.80
LookAnd     -0.79
jspx        -0.77
Diſ         -0.74
Jefus       -0.72
Positive Logits
Token       Logit
plain       0.78
like        0.70
barely      0.62
recently    0.57
so          0.55
such        0.54
kidding     0.53
enough      0.53
about       0.52
单纯        0.52
Activation Density: 0.091%
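
The density figure above is most naturally read as the share of token positions on which the feature fires, although the source does not define it. The following is a minimal sketch under that assumption; the function name `activation_density`, the zero threshold, and the sample data are all hypothetical and not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions where the
    feature is active; the source page does not spell this out.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 9 of which are active.
sample = [0.0] * 9991 + [1.5] * 9
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.091% would correspond to the feature firing on roughly 9 out of every 10,000 tokens.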