INDEX
Explanations
questions or statements involving clarifications and specifics
Negative Logits
Token         Logit
leſs          -0.60
kleid         -0.54
anera         -0.54
odyear        -0.54
Landschaft    -0.54
ptid          -0.53
enal          -0.53
ValueStyle    -0.51
draußen       -0.51
odle          -0.50
Positive Logits
Token         Logit
WHICH         1.03
Which         0.99
Which         0.90
which         0.85
which         0.83
Wich          0.77
wich          0.74
hich          0.66
Wich          0.64
lesquels      0.61
Activations Density: 0.250%