INDEX
Explanations
mentions of dates or years
Negative Logits
orz      -0.15
ãģĺ      -0.14
dow      -0.14
adam     -0.14
695      -0.14
ģı       -0.14
cket     -0.13
cann     -0.13
Doyle    -0.13
mand     -0.13
Positive Logits
IDX      0.15
ua       0.15
Premi    0.15
fon      0.15
ož       0.14
importe  0.14
ima      0.14
ermal    0.14
kvinna   0.14
tiener   0.14
Activations Density 0.045%