INDEX
Explanations
situational awareness and colors
Negative Logits
Alberta    0.45
North      0.45
News       0.41
Centre     0.40
Center     0.40
'          0.40
lét        0.39
வதேச       0.38
Por        0.38
狰         0.38
Positive Logits
vok        0.42
glu        0.41
väx        0.41
vikt       0.40
ાલુ        0.39
সংযুক্ত    0.39
জাহাজ      0.38
rezat      0.38
vadanti    0.38
otvore     0.38
Activations Density 0.000%