INDEX
Explanations
countries and geopolitical entities
Negative Logits
Tomasz	0.45
}^{-}\	0.41
冢	0.41
китай	0.41
Australian	0.40
chinese	0.40
мои	0.39
Peruvian	0.39
itsyn	0.38
inese	0.38
Positive Logits
भारता	0.75
humanity	0.74
България	0.69
우리나라	0.69
Humanity	0.65
Britain	0.64
Britain	0.64
America	0.62
mankind	0.61
இந்தியா	0.59
Activations Density 0.008%