INDEX
Explanations
references to individuals or entities associated with the prefix "Ir"
Negative Logits
Italijanski    -0.45
Климаты        -0.41
copically      -0.38
szerint        -0.37
ായ             -0.37
VisualStyle    -0.37
figliu         -0.37
diterapkan     -0.36
erías          -0.35
tauscht        -0.35
Positive Logits
Ir      1.15
Ir      1.06
ir      1.02
Kir     0.96
Kirk    0.96
Kirk    0.90
kir     0.90
ir      0.89
Kir     0.88
hir     0.87
Activations Density 1.687%