INDEX
Explanations
accents in text
repeated phrases or expressions
Negative Logits
raints      -0.87
matic       -0.82
orial       -0.73
urated      -0.69
primates    -0.68
writers     -0.68
enegger     -0.65
ulative     -0.65
apes        -0.62
utra        -0.60
Positive Logits
âĶĢâĶĢ      1.16
ï¸ı         0.96
ĺ           0.91
ľ           0.90
¸           0.89
ª           0.89
Ĺ           0.88
ļ           0.87
fter        0.86
¼           0.86
Activations Density 0.233%