Explanations
references to individuals and their contributions in academic or research contexts
Negative Logits
themſelves    -1.03
itſelf        -1.01
ſelf          -1.01
faſt          -1.00
myſelf        -0.99
auffi         -0.97
ujednoznacz   -0.95
Efq           -0.95
Beſ           -0.95
ſelves        -0.94
Positive Logits
S             0.60
.             0.58
l             0.48
co            0.48
mb            0.48
L             0.48
C             0.47
l             0.47
b             0.47
g             0.47
Activations Density 0.269%