INDEX
Explanations
repeated instances of the word "que"
Negative Logits
purpoſe          -0.67
autorytatywna    -0.66
pleaſure         -0.65
laſt             -0.64
resourceCulture  -0.61
Younger          -0.60
ſtre             -0.58
houſe            -0.58
ſeveral          -0.57
iprot            -0.57
Positive Logits
what   0.90
what   0.81
hvad   0.74
آنچه   0.74
WHAT   0.70
What   0.67
What   0.65
Čo     0.65
لما    0.64
مما    0.63
Activations Density 0.118%