INDEX
Explanations
proper nouns or phrases related to news and events
the end of documents or text passages
Negative Logits
Niet        -0.68
horizont    -0.67
å§          -0.67
fert        -0.65
ãĥ¼ãĥĨ      -0.64
Seym        -0.63
Chero       -0.62
Kurd        -0.61
âĨij         -0.61
intimid     -0.61
Positive Logits
lash        0.84
utenberg    0.78
rael        0.76
ELF         0.73
weet        0.72
ullivan     0.71
hi          0.71
rimp        0.71
ourge       0.70
reet        0.70
Activations Density 0.020%