INDEX
Explanations
proper nouns related to people, places, and organizations
references to organizations and groups within a specific context
Negative Logits
etter       -0.63
nings       -0.63
enegger     -0.61
asio        -0.60
etts        -0.59
istically   -0.59
rooms       -0.58
illon       -0.58
omore       -0.57
liest       -0.57
Positive Logits
metic       0.60
ãģ£         0.53
é»Ĵ         0.51
ãģª         0.50
Stun        0.50
د           0.50
ãģ          0.50
åIJ          0.50
ÙĬ          0.49
ש           0.49
Activations Density 0.709%