INDEX
Explanations
variables related to numerical values or constants
Negative Logits
Theſe     -0.98
myſelf    -0.95
ſelf      -0.90
compri    -0.88
Chriſt    -0.87
Lieben    -0.85
ſhall     -0.84
iſt       -0.84
་་        -0.84
Schwe     -0.83
Positive Logits
K          1.57
K          1.53
k          1.40
k          1.29
nationaux  0.98
assium     0.95
𝐤          0.92
К          0.91
К          0.90
Nick       0.88
Activations Density 0.139%