INDEX
Explanations
references to individuals and their titles or positions
Negative Logits
Token       Logit
ัà¸Ĺ         -0.16
ward        -0.15
ì°          -0.15
SKTOP       -0.15
.protobuf   -0.14
edis        -0.14
_RS         -0.14
.usage      -0.14
WARD        -0.14
aeda        -0.14
Positive Logits
Token       Logit
Ol          0.21
Innoc       0.20
Fest        0.20
Collins     0.19
ony         0.19
ol          0.19
Fol         0.18
Barr        0.17
Chin        0.17
Bol         0.17
Activations Density 0.036%