Explanations
mentions of a specific person named Boris
Negative Logits
Token    Logit
essor    -0.84
house    -0.82
erve     -0.79
ese      -0.76
heimer   -0.76
mark     -0.76
ifact    -0.76
ppelin   -0.74
venge    -0.74
isco     -0.72
Positive Logits
Token    Logit
Dia      0.78
Aval     0.71
Yar      0.70
Isles    0.67
Yel      0.67
Nem      0.67
Eps      0.66
RON      0.66
fusc     0.65
æŃ       0.64
Activations Density: 0.039%