INDEX
Explanations
proper nouns
proper nouns, particularly names and locations
Negative Logits
────          -0.80
wiret         -0.69
plagiar       -0.68
FISA          -0.65
generational  -0.65
copied        -0.64
conspir       -0.64
clipboard     -0.63
POLIT         -0.62
GROUP         -0.62
Positive Logits
ava           1.01
oya           1.01
onda          0.95
hend          0.95
ai            0.94
ua            0.92
onia          0.91
ena           0.90
uda           0.89
atha          0.89
Activations Density 0.524%