INDEX
Explanations
conjunctions and logical relations in the text
Negative Logits
تانيه      -0.82
wnętr      -0.81
rrggbb     -0.73
ftate      -0.69
Cæsar      -0.69
yawn       -0.69
Tazama     -0.69
reaſon     -0.67
habet      -0.67
snorkel    -0.67
Positive Logits
izing          0.73
ating          0.72
making         0.69
taking         0.69
getting        0.67
transporting   0.66
doing          0.66
ting           0.65
enjoying       0.64
applying       0.63
Activations Density 0.518%