Explanations
phrases indicating conditional situations or outcomes
Negative Logits
Token             Logit
itſelf            -1.07
myſelf            -1.01
themſelves        -0.96
raiſ              -0.95
ſever             -0.93
ſeveral           -0.92
whoſe             -0.91
purpoſe           -0.88
ſtate             -0.87
IntoConstraints   -0.86
Positive Logits
Token             Logit
:                 0.90
:                 0.71
namely            0.64
namely            0.64
is                0.60
                  0.59
Namely            0.58
的是              0.57
那就是            0.57
:                 0.54
Activations Density 0.683%