INDEX
Explanations
phrases related to documentaries or reports about specific events or people
the pronoun "it" in various contexts
Negative Logits
Token          Logit
Priv           -0.62
warning        -0.60
Priv           -0.58
scribe         -0.56
priv           -0.55
distraction    -0.54
devices        -0.52
palate         -0.52
dictatorship   -0.50
Guinea         -0.49
Positive Logits
Token          Logit
alian          1.32
chy            1.29
self           1.21
unes           1.06
alia           1.00
anium          0.93
ÃĥÃĤ           0.93
atic           0.90
asca           0.90
seems          0.89
Activations Density 0.182%