Explanations
the word "weak" and words that mean the opposite of weak
Negative Logits
Token         Logit
myſelf        -1.63
Monfieur      -1.59
itſelf        -1.58
متعلقه        -1.52
―――――         -1.51
ſeveral       -1.41
pleaſure      -1.40
Reſ           -1.37
+#+#          -1.36
faſt          -1.34
Positive Logits
Token              Logit
A                  0.82
"                  0.81
P                  0.80
T                  0.79
[no token shown]   0.76
H                  0.75
v                  0.75
I                  0.73
M                  0.73
m                  0.72
Activations Density 3.495%