Explanations
conditional phrases and exceptions in narratives
Negative Logits
Token             Logit
ustral            -0.17
raith             -0.16
rieg              -0.14
━━━━━━━━━━━━━━━━  -0.14
irc               -0.14
Occurred          -0.13
βε                -0.13
phalt             -0.13
lem               -0.13
AIT               -0.13
Positive Logits
Token             Logit
apart             0.20
aside             0.18
azen              0.16
ebek              0.16
проч              0.15
Apart             0.14
besides           0.14
istes             0.14
mö                0.14
aside             0.14
Activations Density: 0.033%