INDEX
Explanations
proper names or titles mentioned in text
references to names and titles
Negative Logits
onga        -0.79
ometers     -0.71
ersive      -0.65
oun         -0.64
erker       -0.63
gif         -0.62
aukee       -0.62
eren        -0.62
contrace    -0.61
umat        -0.61
Positive Logits
"#          0.95
''          0.86
"@          0.82
onyms       0.74
"(          0.72
``          0.71
name        0.70
é£          0.70
Osama       0.70
"_          0.69
Activations Density: 0.104%