INDEX
Explanations
references to specific entities or nouns, such as organizations, people, or products
Negative Logits
Token       Logit
terday      -0.83
ĸļ          -0.76
oÄŁ         -0.74
retrospect  -0.72
htt         -0.72
puter       -0.71
derail      -0.70
craw        -0.69
nsics       -0.68
Conquest    -0.67
Positive Logits
Token       Logit
reet        1.37
reetings    1.29
ossip       1.28
oliath      1.27
asp         1.26
ourmet      1.26
irlfriend   1.24
CHQ         1.24
opher       1.23
ospels      1.23
Activations Density 8.975%