INDEX
Explanations
references to environmental or health-related issues
Negative Logits
Calvo       -0.83
foglal      -0.76
poffe       -0.68
riuscito    -0.66
nsf         -0.65
Integrity   -0.65
пта         -0.64
5           -0.64
T           -0.63
capri       -0.63
Positive Logits
toward      1.69
toward      1.63
Toward      1.63
Toward      1.56
towards     1.51
Towards     1.50
Towards     1.49
towards     1.42
TOW         1.10
hacia       1.10
Activations Density 0.062%