Explanations
No Explanations Found
Negative Logits
Token       Logit
<bos>       -0.52
ſtate       -0.47
houſe       -0.39
LogUtils    -0.38
avenir      -0.37
Retry       -0.37
outheast    -0.37
agujas      -0.37
ſche        -0.36
Hö          -0.36
Positive Logits
Token       Logit
was         1.10
was         1.06
Was         0.95
were        0.91
Was         0.90
were        0.86
WAS         0.84
Twas        0.79
WERE        0.79
была        0.78
Activations Density: 0.000%
No Known Activations
This feature has no known activations.