INDEX
Explanations
locations and features related to places and structures
Negative Logits
Efq        -0.97
Jefus      -0.94
myſelf     -0.94
himſelf    -0.93
whoſe      -0.91
StructEnd  -0.90
Monfieur   -0.89
reaſon     -0.89
Chrift     -0.88
chofe      -0.88
Positive Logits
where  0.66
where  0.56
pure   0.55
Where  0.53
mat    0.53
Where  0.51
multi  0.49
lots   0.48
sign   0.48
waar   0.47
Activations Density: 0.413%