INDEX
Explanations
proper nouns
items or events associated with significant individuals or roles in society
Negative Logits
Token       Logit
ELF         -0.61
ãĥ¥         -0.61
(>          -0.59
ESA         -0.59
ize         -0.58
Different   -0.56
gress       -0.55
ORT         -0.55
ente        -0.55
TOR         -0.53
Positive Logits
Token       Logit
wrote       0.96
joins       0.95
remembers   0.91
teaches     0.90
joined      0.90
became      0.89
testified   0.88
took        0.88
reacted     0.87
flew        0.85
Activations Density 0.167%