INDEX
Explanations
names of people and locations
Negative Logits
}{@
-0.92
ValueStyle
-0.90
ModelExpression
-0.88
boutin
-0.86
rungsseite
-0.85
Monfieur
-0.79
שוליים
-0.79
itſelf
-0.78
iſt
-0.78
новниш
-0.77
Positive Logits
(
0.47
I
0.45
P
0.44
↵
0.43
&
0.42
H
0.41
sure
0.37
B
0.37
…
0.37
J
0.37
Activations Density 0.070%