Explanations
mentions of specific geographical locations, events, and individuals related to a particular region
Negative Logits
ngth           -0.71
DonaldTrump    -0.70
ilial          -0.69
Schn           -0.68
NETWORK        -0.68
ensional       -0.61
guid           -0.61
Magikarp       -0.61
awaru          -0.61
oblig          -0.61
Positive Logits
ished          1.05
aby            1.05
er             0.99
lee            0.97
burn           0.93
ards           0.92
ishing         0.91
lett           0.89
ters           0.89
iation         0.84
Activations Density 0.018%