Explanations
references to specific cultural or artistic elements
Negative Logits
Token           Logit
shan            -0.16
emez            -0.15
ONGL            -0.15
estate          -0.14
-ves            -0.14
aryl            -0.14
tm              -0.14
odont           -0.13
IFn             -0.13
XmlAttribute    -0.13
Positive Logits
Token           Logit
Iz              0.28
iz              0.23
Blo             0.22
Part            0.21
Blo             0.20
Coal            0.20
PS              0.20
milit           0.20
IU              0.19
partido         0.19
Activations Density: 0.002%