INDEX
Explanations
text related to additional features or bonuses
phrases or words that contain the concept of "plus" or addition
Negative Logits
adr       -0.86
arer      -0.76
lied      -0.74
oder      -0.73
ayn       -0.73
jar       -0.71
bris      -0.71
nce       -0.70
atters    -0.69
elly      -0.69
Positive Logits
minus     1.05
PLUS      0.92
plus      0.91
cules     0.83
minus     0.74
Ukrain    0.74
Plus      0.71
/-        0.70
Plus      0.69
FontSize  0.69
Activations Density 0.008%