INDEX
Explanations
phrases related to specific people, such as politicians or public figures
references to specific individuals or entities in a political or social context
Negative Logits
ãĤ¼ãĤ¦ãĤ¹                -0.81
ãĤ½                      -0.77
@#&                      -0.77
FIELD                    -0.74
ãĤµãĥ¼ãĥĨãĤ£ãĥ¯ãĥ³       -0.72
Âł Âł Âł Âł Âł Âł Âł Âł  -0.68
Predator                 -0.68
è»                       -0.67
ä¹ĭ                      -0.67
FUL                      -0.67
Positive Logits
enger     1.05
aida      1.04
hou       0.96
artisan   0.89
hak       0.89
mus       0.88
hi        0.88
ht        0.85
cius      0.84
anyahu    0.84
Activations Density 0.039%
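
Several tokens in the negative-logit list (for example `ãĤ¼ãĤ¦ãĤ¹` and `ä¹ĭ`) look like mojibake. They are consistent with the byte-to-character alphabet used by GPT-2-style byte-level BPE vocabularies, in which every raw byte is shown as a printable character. The sketch below assumes that encoding, which the page itself does not state, and maps such tokens back to readable text. The names `bytes_to_unicode` and `decode_token` are illustrative helpers, not part of any dashboard API.

```python
def bytes_to_unicode():
    """Build the byte -> printable-character table used by GPT-2-style byte-level BPE.

    Assumption: the tokens on this page come from a vocabulary that uses this table.
    """
    # Bytes that are already printable map to themselves.
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    byte_values = printable[:]
    code_points = printable[:]
    shift = 0
    # Every remaining byte is assigned a code point starting at 256.
    for b in range(256):
        if b not in printable:
            byte_values.append(b)
            code_points.append(256 + shift)
            shift += 1
    return {b: chr(c) for b, c in zip(byte_values, code_points)}


# Inverse table: displayed character -> original byte value.
CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Map a displayed vocabulary token back to text.

    Characters outside the table (such as a literal space added by the page
    layout) are passed through as their own UTF-8 bytes. Incomplete UTF-8
    sequences decode to the replacement character U+FFFD.
    """
    raw = bytearray()
    for ch in token:
        if ch in CHAR_TO_BYTE:
            raw.append(CHAR_TO_BYTE[ch])
        else:
            raw.extend(ch.encode("utf-8"))
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ãĤ¼ãĤ¦ãĤ¹", "ä¹ĭ", "Âł", "è»"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Under that assumption, `ãĤ¼ãĤ¦ãĤ¹` decodes to the katakana string `ゼウス`, `ä¹ĭ` to `之`, and `Âł` to a non-breaking space. A token such as `è»` is only a fragment of a multi-byte character, so it has no readable form on its own.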