Explanations
phrases indicating the significance or prevalence of certain characteristics, outcomes, or behaviors
Negative Logits
Token          Logit
both           -0.73
moveToFirst    -0.72
muitas         -0.72
both           -0.68
many           -0.67
sometimes      -0.63
многих         -0.62
most           -0.61
muchas         -0.61
many           -0.61
Positive Logits
Token          Logit
цездатний      0.73
comprised      0.73
consisting     0.72
devoted        0.71
consisted      0.70
consists       0.68
focused        0.67
composed       0.65
UserScript     0.65
InitStruct     0.64
Activations Density: 0.358%
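
The density figure can be read as a simple ratio. The following is a minimal sketch, assuming that density means the share of sampled tokens on which the feature's activation is above zero; the dashboard itself does not state the definition. The function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical and are not taken from the dashboard.

```python
from typing import Sequence


def activation_density(activations: Sequence[float], threshold: float = 0.0) -> float:
    """Return the fraction of activations strictly above `threshold`.

    Assumption: "density" is the share of sampled tokens on which the
    feature fires at all. This definition is not confirmed by the source.
    """
    if not activations:
        return 0.0
    fired = sum(1 for a in activations if a > threshold)
    return fired / len(activations)


# Hypothetical sample: the feature fires on 1 of 4 tokens.
sample = [0.0, 0.0, 0.5, 0.0]
print(f"{activation_density(sample):.3%}")
```

Under this reading, a density of 0.358% would correspond to roughly 36 firing tokens per 10,000 sampled.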