INDEX
Explanations
Sentences concerning individuals' actions or statuses within specific organizations or institutions
Symbols and special characters in the text
Negative Logits
Valiant      -0.79
Golem        -0.77
scattering   -0.71
Naz          -0.70
nod          -0.69
bye          -0.69
Paras        -0.68
scatter      -0.66
Sakuya       -0.65
Nam          -0.65
Positive Logits
£            0.95
¹            0.92
âĢł          0.91
Į            0.88
¢            0.87
ISIS         0.87
§            0.84
taboola      0.83
ı            0.82
Colorado     0.81
Activations Density 0.501%