Explanations
analogies or comparisons in various contexts
Negative Logits
myſelf    -1.18
uſed      -1.08
fubject   -1.08
cauſe     -1.05
himſelf   -1.05
raiſ      -1.04
ſtate     -1.01
Majefty   -1.01
purpoſe   -1.00
preſent   -0.98
Positive Logits
like       0.84
akin       0.78
a          0.78
like       0.70
Like       0.70
就像是     0.67
an         0.63
resembles  0.63
Like       0.62
就像       0.62
Activations Density: 0.306%