Explanations
the word "pilot"
Negative Logits
Token    Logit
[blank]  -1.09
[blank]  -0.94
-        -0.91
↵        -0.88
<eos>    -0.85
(        -0.85
[blank]  -0.84
The      -0.76
P        -0.75
↵↵       -0.74
Positive Logits
Token    Logit
myſelf   1.92
houſe    1.84
itſelf   1.74
Efq      1.71
ſelf     1.67
Reſ      1.67
Houſe    1.66
raiſ     1.63
purpoſe  1.63
Anſ      1.62
Activations Density 1.117%