Explanations
phrases indicating causation or reasoning
Negative Logits
Token       Logit
та          -0.16
¹           -0.15
ERSION      -0.14
кот         -0.14
erialize    -0.14
ấp          -0.14
ecd         -0.14
onical      -0.14
SELF        -0.13
mana        -0.13
Positive Logits
Token       Logit
lack        0.17
age         0.16
éal         0.15
dup         0.14
Oswald      0.14
λης         0.14
eof         0.14
Fletcher    0.14
lease       0.13
#__         0.13
Activations Density: 0.068%