INDEX
Explanations
references to complex legal or political issues
Negative Logits
ه    -0.96
ing    -0.94
Parke    -0.78
ی    -0.77
Colbert    -0.76
es    -0.73
er    -0.72
ené    -0.71
ená    -0.70
tka    -0.69
Positive Logits
findpost    0.97
swag    0.95
Mousse    0.91
Rabin    0.91
UpInside    0.87
Durand    0.84
Washer    0.83
omány    0.82
suit    0.82
Oss    0.80
Activations Density: 0.496%