INDEX
Explanations
phrases indicating a lack of knowledge or understanding about a subject
Negative Logits
rungsseite      -0.64
vorschaubild    -0.58
sqcup           -0.58
هی              -0.56
beging          -0.56
izzle           -0.56
depth           -0.55
pulseira        -0.55
Cuc             -0.55
ctile           -0.55
Positive Logits
known           1.14
Known           1.12
Known           1.05
known           1.01
KNOWN           0.98
KNOWN           0.96
Know            0.91
Know            0.88
know            0.83
Knows           0.77
Activations Density: 0.023%