Explanations
document structure indicators or metadata
Negative Logits
al     -0.70
si     -0.70
ki     -0.69
if     -0.68
do     -0.67
,      -0.67
con    -0.66
ne     -0.66
in     -0.66
mu     -0.66
Positive Logits
itſelf        1.27
myſelf        1.20
Anſ           1.19
himſelf       1.16
Jefus         1.15
ſelves        1.15
Houſe         1.11
Eſ            1.06
themſelves    1.06
Efq           1.06
Activations Density 0.344%