Explanations
references to a specific entity or name, likely related to a scholarly or formal context
Negative Logits
token          logit
poffe          -1.10
raiſ           -1.03
myſelf         -1.02
poffible       -0.94
ViewFeatures   -0.92
ſhe            -0.91
iſt            -0.90
juſt           -0.89
faſt           -0.86
becauſe        -0.86
Positive Logits
token          logit
Sch            3.13
Sch            3.10
sch            2.89
sch            2.69
SCH            2.38
SCH            2.23
scho           1.53
Scho           1.51
Schl           1.50
ensch          1.48
Activations Density: 0.042%