INDEX
Explanations
emphasized adjectives expressing degree
Negative Logits
Token          Logit
1              -0.77
0              -0.74
scrubs         -0.69
tableFuture    -0.68
Gelb           -0.67
thenReturn     -0.66
abo            -0.66
Вікіпе         -0.65
onel           -0.64
Administr      -0.64
Positive Logits
Token          Logit
highly         1.04
highly         0.97
Highly         0.95
Highly         0.95
angliski       0.91
)");           0.82
ientras        0.81
InitStruct     0.80
Sehr           0.80
⁸              0.79
Activations Density: 0.126%