INDEX
Explanations
phrases that indicate meaning or definitions
Negative Logits
يتيمه    -0.55
oprot    -0.55
vrijwilli    -0.51
utford    -0.51
PerformLayout    -0.50
fromnode    -0.50
طني    -0.48
ThroughAttribute    -0.47
paraíso    -0.46
odyne    -0.46
Positive Logits
mean    3.73
MEAN    3.42
Mean    3.39
mean    3.31
Mean    3.16
MEAN    3.09
Means    2.68
meant    2.67
means    2.65
Means    2.58
Activations Density: 0.105%