INDEX
Explanations
proper nouns associated with geopolitical conflicts
names of places, entities, and organizations involved in conflicts or significant relationships
Negative Logits
©¶æ¥µ    -0.67
ŃĶ    -0.66
ãĥķãĤ©    -0.64
uala    -0.61
ank    -0.56
guide    -0.53
NRS    -0.53
havoc    -0.52
Shape    -0.51
ouf    -0.50
Positive Logits
])    0.59
factions    0.58
}\    0.57
athed    0.55
doms    0.54
versus    0.54
equals    0.54
isans    0.53
entary    0.53
})    0.53
Activations Density: 0.373%