INDEX
Explanations
references to notable individuals or roles related to academia and media
Negative Logits
Token        Logit
prive        -0.13
arence       -0.13
<--          -0.13
’aut         -0.13
african      -0.12
nonprofit    -0.12
bourne       -0.12
lowercase    -0.12
...↵↵↵       -0.12
ÑĢаÑģ       -0.12
Positive Logits
Token        Logit
Iranians     0.19
Honour       0.19
Israelis     0.19
honour       0.18
;'           0.18
USA          0.18
Juda         0.18
whilst       0.18
;            0.17
Obama        0.17
Activations Density 0.003%